On Sun, Nov 15, 2020 at 9:16 AM Andrew Purtell <[email protected]>
wrote:

> I agree with Duo’s comment that a performance gain is unlikely but would
> be orthogonal anyway;


The perf observation is just an aside in the issue. Perf is orthogonal, as
you say above (as long as there is no regression).



> it’s an availability gain that is the goal. We can assume it based on
> theory of operation and unit test results but the gain should be tested and
> measured on a cluster too.
>


The feature is about distributing load on hbase:meta to alleviate
hotspotting; it keeps read replicas more current, so they are more likely
to satisfy location lookups, which makes them more effective. That read
replicas improve HA is presumed -- it was the original justification for
this years-old commit -- but HA is not the focus of this addition; hence no
reports on effectiveness in this area.
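
As a rough illustration only, the idea of serving location lookups from a
read replica first and falling back to the primary on fault can be sketched
as below. All names here (locate, ReplicaUnavailable, the callables) are
hypothetical and are not the HBase client code; random choice stands in for
whatever selection the real policy uses.

```python
import random


class ReplicaUnavailable(Exception):
    """Raised by a (hypothetical) lookup callable when a replica cannot answer."""


def locate(row, primary, replicas, rng=random):
    """Look up `row` on one randomly chosen replica; fall back to the primary.

    `primary` and each entry in `replicas` are callables taking a row key and
    returning a location. This is an assumption made for the sketch, not an
    HBase interface.
    """
    if replicas:
        replica = rng.choice(replicas)  # spreads read load across replicas
        try:
            return replica(row)
        except ReplicaUnavailable:
            pass  # replica faulted; fall through to the primary
    return primary(row)
```

With several replicas configured, most lookups never touch the primary,
which is where the relief from hotspotting would come from.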

I have no problem working on such tests/reports but suggest that they are
done post merge.



> That said, the results of the testing thus far indicate no regression,
> which gives me confidence to support a merge. Specifically, a merge to
> “unblock” 2.4 (we aren’t really blocked, we are waiting), provided the
> feature is configured off by default there. But please indicate in
> documentation and release notes that the feature is not widely tested yet -
> as is customarily done for new functionality like this.
>
>
No problem w/ flagging the feature as new.

Thanks,
S



>
> > On Nov 15, 2020, at 5:20 AM, 张铎 <[email protected]> wrote:
> >
> > Replied on jira, I think we missed an important scenario when testing.
> >
> > Thanks.
> >
> > On Sun, Nov 15, 2020 at 2:30 AM, Stack <[email protected]> wrote:
> >
> >> HBASE-18070 makes it so hbase:meta read replicas can run closer to the
> >> primary (sub-second lag rather than minutes). It adds Async WAL
> >> Replication[1] on the hbase:meta table; i.e. edits are sprayed across
> >> replicas as they arrive at the primary's WAL. Before this work, Async WAL
> >> Replication was only available on user-space tables, and the only option
> >> for hbase:meta read replicas was reloading the primary's hfiles on a
> >> period (minutes). HBASE-18070 also adds an optional client-side
> >> 'LoadBalance' policy that favors read replicas ahead of primary reads,
> >> falling back to the primary on fault. Together, these additions allow
> >> distributing hbase:meta read load across primary and replicas, alleviating
> >> 'hotspotting'.
> >>
> >> I would like to merge the feature to master branch Monday evening if
> >> there is no objection. (Soon after, I'll merge to branch-2 so this feature
> >> can hopefully be included in the upcoming 2.4.0 RC.)
> >>
> >> * For the design, see [2].
> >> * For an amalgamated PR of the 5 or 6 reviewed PRs that comprise this
> >> feature, see [3].
> >> * For a PE report that compared performance before and after, see
> >> HBASE-25127 (no regression).
> >> * A report on ITBLL runs is pending to be attached to HBASE-18070, but
> >> runs so far show no regression with the feature enabled (ITBLL runs were
> >> done against a backport of this feature to branch-2 as the ITBLL state of
> >> master is currently an unknown).
> >>
> >> Testing continues, mainly looking for further improvement and to better
> >> understand this feature in operation. Documentation is included but in
> >> need of polish (working on it).
> >>
> >> Dump any questions here and I'll be happy to respond. If you need more
> >> time to review, just shout.
> >>
> >> Thanks, and thanks to all who contributed to this feature; the reviewers
> >> and the testers in particular.
> >>
> >> S
> >>
> >> 1. http://hbase.apache.org/book.html#_asnyc_wal_replication
> >> 2. https://docs.google.com/document/d/1jJWVc-idHhhgL4KDRpjMsQJKCl_NRaCLGiH3Wqwd3O8/edit#
> >> This patch is currently missing HBASE-25280, a bug found in testing.
> >> 3. https://github.com/apache/hbase/pull/2643
> >>
>
