Folks,

    As I explained in the jira, before our load hits the region server's
limit, the throughput is going to be similar for primary-only and meta
replica Load Balance mode. We are examining  our test today, trying to push
up the load and
will come back.

    Best Regards,

    Huaxiang Sun



On Mon, Nov 16, 2020 at 8:15 AM Andrew Purtell <[email protected]> wrote:

> I see, sure we can try to resolve it.
>
> We had two full weeks, from now until the end of the month, for the RC, to
> get it out before end of month. Further delay lessens that time frame but I
> admit it is an arbitrary target and that shouldn't be the top most concern
> given the interest of the contributors as expressed on this thread ("Our
> group were hoping to throw our shoulder behind 2.4 stabilizing so we could
> deploy it to production.").
>
> I commented on the JIRA. Perhaps the difference in technical opinion comes
> down to a replica preference policy alternative that can be resolved with
> follow up work, and that is the way forward.
>
>
> On Mon, Nov 16, 2020 at 7:59 AM Stack <[email protected]> wrote:
>
> > On Mon, Nov 16, 2020 at 7:44 AM Andrew Purtell <[email protected]>
> > wrote:
> >
> > > My apologies, Stack, it's time to move on for 2.4. We can revisit this
> > for
> > > 2.5.
> > >
> > >
> > One more day to allow Duo reconsider (my fault for not making this a VOTE
> > thread)?
> >
> > The work here is mostly that of others. It would be a shame it didn't
> land
> > in 2.4 because of my representation. Our group were hoping to throw our
> > shoulder behind 2.4 stabilizing so we could deploy it to production.
> > Without this feature, we'll have to reconsider.
> >
> > Thanks Andrew,
> > S
> >
> >
> >
> > > On Mon, Nov 16, 2020 at 7:41 AM Stack <[email protected]> wrote:
> > >
> > > > On Sun, Nov 15, 2020 at 11:20 PM 张铎(Duo Zhang) <
> [email protected]>
> > > > wrote:
> > > >
> > > > > So what is your purpose of distributing the request of region
> > location
> > > > > lookup? It is just because you want to 'distribute the request of
> > > region
> > > > > location lookup'?
> > > > >
> > > > > Then I'm -1 on merging. We should reach an agreement on what we
> want
> > to
> > > > > solve before merging at least.
> > > > >
> > > > > I've helped this issue from the design doc step. For me, the
> purpose
> > > for
> > > > > this issue is clear. We want to prevent the hotspot of meta, so the
> > > > > solution is simple, enable meta replica, and then just modify the
> > > client
> > > > to
> > > > > not always go to primary replica first(this is the old behavior
> even
> > > with
> > > > > meta replica feature on).
> > > > > And this will introduce another problem that, there is no meta
> region
> > > > > replication implementation for meta read replicas, which means the
> > > > latency
> > > > > will be large as we can only sync the data between replicas through
> > > > region
> > > > > flush, so we implement meta region replication.
> > > > >
> > > > > So I think it is very important to verify that we have truly
> > > distributed
> > > > > the request of region location lookup, and also make sure that we
> > could
> > > > > support more requests of region location lookup. Otherwise this
> > feature
> > > > is
> > > > > useless.
> > > > >
> > > > > And I agree with Andrew that, since the feature is default off on
> > > > branch-2
> > > > > and has no regression, it is OK to merge for now. Theoretically our
> > > > > approach here should work, so even it does not work for now, I
> think
> > we
> > > > > could fix the problems to make it work.
> > > > >
> > > > >
> > > > Please undo your -1. We can work on differing understandings in JIRA
> > > while
> > > > I work on the report you suggested and while 2.4.0RC proceeds.
> > > >
> > > > S
> > > >
> > > >
> > > >
> > > > > But your reply above made me wonder whether we are talking about
> the
> > > same
> > > > > thing. That's why I'm -1 here. I'm not going to force you to do the
> > > test
> > > > > suggested by me, as I said it could be done after merging, just
> want
> > to
> > > > > reach an agreement on the goal of this feature.
> > > > >
> > > > > Thanks.
> > > > >
> > > > > Stack <[email protected]> 于2020年11月16日周一 下午12:35写道:
> > > > >
> > > > > > On Sun, Nov 15, 2020 at 9:16 AM Andrew Purtell <
> > > > [email protected]
> > > > > >
> > > > > > wrote:
> > > > > >
> > > > > > > I agree with Duo’s comment that a performance gain is unlikely
> > but
> > > > > would
> > > > > > > be orthogonal anyway;
> > > > > >
> > > > > >
> > > > > > Perf observation is just an aside in the issue. Perf is
> orthogonal
> > as
> > > > you
> > > > > > say above (as long as no regression).
> > > > > >
> > > > > >
> > > > > >
> > > > > > > it’s an availability gain that is the goal. We can assume it
> > based
> > > on
> > > > > > > theory of operation and unit test results but the gain should
> be
> > > > tested
> > > > > > and
> > > > > > > measured on a cluster too.
> > > > > > >
> > > > > >
> > > > > >
> > > > > > The feature is about distributing load on hbase:meta to alleviate
> > > > > > hotspotting; it makes read replicas more live so replicas are
> more
> > > > likely
> > > > > > to satisfy location lookups making read replicas more effective.
> > That
> > > > > read
> > > > > > replicas improve HA is presumed -- it was the original
> > justification
> > > > for
> > > > > > this years old commit -- but HA is not the focus of this
> addition;
> > > > hence
> > > > > no
> > > > > > reports on effectiveness in this area.
> > > > > >
> > > > > > I have no problem working on such tests/reports but suggest that
> > they
> > > > are
> > > > > > done post merge.
> > > > > >
> > > > > >
> > > > > >
> > > > > > > That said, the results of the testing thus far indicate no
> > > > regression,
> > > > > > > which gives me confidence to support a merge. Specifically, a
> > merge
> > > > to
> > > > > > > “unblock” 2.4 (we aren’t really blocked, we are waiting),
> > provided
> > > > the
> > > > > > > default there is the feature is configured off. But please
> > indicate
> > > > in
> > > > > > > documentation and release notes that the feature is not widely
> > > tested
> > > > > > yet -
> > > > > > > as is customarily done for new functionality like this.
> > > > > > >
> > > > > > >
> > > > > > No problem w/ flagging the feature as new.
> > > > > >
> > > > > > Thanks,
> > > > > > S
> > > > > >
> > > > > >
> > > > > >
> > > > > > >
> > > > > > > > On Nov 15, 2020, at 5:20 AM, 张铎 <[email protected]>
> wrote:
> > > > > > > >
> > > > > > > > Replied on jira, I think we missed an important scenario
> when
> > > > > testing.
> > > > > > > >
> > > > > > > > Thanks.
> > > > > > > >
> > > > > > > > Stack <[email protected]> 于2020年11月15日周日 上午2:30写道:
> > > > > > > >
> > > > > > > >> HBASE-18070 makes it so hbase:meta read replicas can run
> > closer
> > > to
> > > > > the
> > > > > > > >> primary, (< second lags rather than minutes). It adds Async
> > WAL
> > > > > > > >> Replication[1] on the hbase:meta table; i.e. edits are
> sprayed
> > > > > across
> > > > > > > >> replicas as they arrive at the primary's WAL. Before this
> > work,
> > > > > Async
> > > > > > > WAL
> > > > > > > >> Replication was only available on user-space tables and the
> > only
> > > > > > option
> > > > > > > for
> > > > > > > >> hbase:meta read replicas was reloading the primaries hfiles
> > on a
> > > > > > period
> > > > > > > >> (minutes). HBASE-18070 also adds an optional client-side
> > > > > 'LoadBalance'
> > > > > > > >> policy that favors read replicas ahead of primary reads
> > falling
> > > > back
> > > > > > to
> > > > > > > the
> > > > > > > >> primary on fault. Together, these additions allow
> distributing
> > > > > > > hbase:meta
> > > > > > > >> read load across primary and replicas alleviating
> > 'hotspotting'.
> > > > > > > >>
> > > > > > > >> I would like to merge the feature to master branch Monday
> > > evening
> > > > if
> > > > > > > there
> > > > > > > >> is no objection (Soon after I'll merge to branch-2 so this
> > > feature
> > > > > can
> > > > > > > >> hopefully be included in the upcoming 2.4.0RC).
> > > > > > > >>
> > > > > > > >> * For the design, see [2].
> > > > > > > >> * For an amalgamated PR of the 5 or 6 reviewed PRs that
> > comprise
> > > > > this
> > > > > > > >> feature, see [3].
> > > > > > > >> * For a PE report that compared performance before and
> after,
> > > see
> > > > > > > >> HBASE-25127 (no regression).
> > > > > > > >> * A report on ITBLL runs is pending to be attached to
> > > HBASE-18070
> > > > > but
> > > > > > > runs
> > > > > > > >> so far show no regression with the feature enabled (ITBLL
> runs
> > > > were
> > > > > > done
> > > > > > > >> against a backport of this feature to branch-2 as the ITBLL
> > > state
> > > > of
> > > > > > > master
> > > > > > > >> is currently an unknown).
> > > > > > > >>
> > > > > > > >> Testing continues mainly looking for further improvement and
> > to
> > > > > better
> > > > > > > >> understand this feature in operation. Documentation is
> > included
> > > > but
> > > > > in
> > > > > > > need
> > > > > > > >> of polish (working on it).
> > > > > > > >>
> > > > > > > >> Dump any questions here and I'll be happy to respond. If you
> > > need
> > > > > more
> > > > > > > time
> > > > > > > >> to review, just shout.
> > > > > > > >>
> > > > > > > >> Thanks and thanks to all who contributed to this feature;
> the
> > > > > > reviewers
> > > > > > > and
> > > > > > > >> the testers in particular.
> > > > > > > >>
> > > > > > > >> S
> > > > > > > >>
> > > > > > > >> 1. http://hbase.apache.org/book.html#_asnyc_wal_replication
> > > > > > > >> 2.
> > > > > > > >>
> > > > > > > >>
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://docs.google.com/document/d/1jJWVc-idHhhgL4KDRpjMsQJKCl_NRaCLGiH3Wqwd3O8/edit#
> > > > > > > >> This patch is currently missing HBASE-25280, a bug found in
> > > > testing.
> > > > > > > >> 3. https://github.com/apache/hbase/pull/2643
> > > > > > > >>
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> > >
> > > --
> > > Best regards,
> > > Andrew
> > >
> > > Words like orphans lost among the crosstalk, meaning torn from truth's
> > > decrepit hands
> > >    - A23, Crosstalk
> > >
> >
>
>
> --
> Best regards,
> Andrew
>
> Words like orphans lost among the crosstalk, meaning torn from truth's
> decrepit hands
>    - A23, Crosstalk
>

Reply via email to