Anyway I don’t wish to discuss this veto at length. What’s done is done. The 
merge is off for the 2.4 RC. It will proceed without this feature.

> On Nov 16, 2020, at 7:33 AM, Andrew Purtell <[email protected]> wrote:
> 
> Your -1 throws a wrench into an agreement we had to merge this for 2.4. The 
> 2.4 RC has been waiting for two weeks for this reason. It is a rather 
> unfriendly act and frustrating to me as 2.4 RC. I feel this needs to be said. 
> Despite what you claim the reason for hedged reads has been explained here 
> and on the JIRA. 
> 
>> On Nov 15, 2020, at 11:20 PM, 张铎 <[email protected]> wrote:
>> 
>> So what is your purpose of distributing the request of region location
>> lookup? It is just because you want to 'distribute the request of region
>> location lookup'?
>> 
>> Then I'm -1 on merging. We should reach an agreement on what we want to
>> solve before merging at least.
>> 
>> I've helped this issue from the design doc step. For me, the purpose for
>> this issue is clear. We want to prevent the hotspot of meta, so the
>> solution is simple, enable meta replica, and then just modify the client to
>> not always go to primary replica first(this is the old behavior even with
>> meta replica feature on).
>> And this will introduce another problem that, there is no meta region
>> replication implementation for meta read replicas, which means the latency
>> will be large as we can only sync the data between replicas through region
>> flush, so we implement meta region replication.
>> 
>> So I think it is very important to verify that we have truly distributed
>> the request of region location lookup, and also make sure that we could
>> support more requests of region location lookup. Otherwise this feature is
>> useless.
>> 
>> And I agree with Andrew that, since the feature is default off on branch-2
>> and has no regression, it is OK to merge for now. Theoretically our
>> approach here should work, so even it does not work for now, I think we
>> could fix the problems to make it work.
>> 
>> But your reply above made me wonder whether we are talking about the same
>> thing. That's why I'm -1 here. I'm not going to force you to do the test
>> suggested by me, as I said it could be done after merging, just want to
>> reach an agreement on the goal of this feature.
>> 
>> Thanks.
>> 
>> Stack <[email protected]> 于2020年11月16日周一 下午12:35写道:
>> 
>>>> On Sun, Nov 15, 2020 at 9:16 AM Andrew Purtell <[email protected]>
>>>> wrote:
>>>> 
>>>> I agree with Duo’s comment that a performance gain is unlikely but would
>>>> be orthogonal anyway;
>>> 
>>> 
>>> Perf observation is just an aside in the issue. Perf is orthogonal as you
>>> say above (as long as no regression).
>>> 
>>> 
>>> 
>>>> it’s an availability gain that is the goal. We can assume it based on
>>>> theory of operation and unit test results but the gain should be tested
>>> and
>>>> measured on a cluster too.
>>>> 
>>> 
>>> 
>>> The feature is about distributing load on hbase:meta to alleviate
>>> hotspotting; it makes read replicas more live so replicas are more likely
>>> to satisfy location lookups making read replicas more effective. That read
>>> replicas improve HA is presumed -- it was the original justification for
>>> this years old commit -- but HA is not the focus of this addition; hence no
>>> reports on effectiveness in this area.
>>> 
>>> I have no problem working on such tests/reports but suggest that they are
>>> done post merge.
>>> 
>>> 
>>> 
>>>> That said, the results of the testing thus far indicate no regression,
>>>> which gives me confidence to support a merge. Specifically, a merge to
>>>> “unblock” 2.4 (we aren’t really blocked, we are waiting), provided the
>>>> default there is the feature is configured off. But please indicate in
>>>> documentation and release notes that the feature is not widely tested
>>> yet -
>>>> as is customarily done for new functionality like this.
>>>> 
>>>> 
>>> No problem w/ flagging the feature as new.
>>> 
>>> Thanks,
>>> S
>>> 
>>> 
>>> 
>>>> 
>>>>>> On Nov 15, 2020, at 5:20 AM, 张铎 <[email protected]> wrote:
>>>>>> 
>>>>>> Replied on jira, I think we missed an important scenario when testing.
>>>>>> 
>>>>>> Thanks.
>>>>>> 
>>>>>> Stack <[email protected]> 于2020年11月15日周日 上午2:30写道:
>>>>>> 
>>>>>>> HBASE-18070 makes it so hbase:meta read replicas can run closer to the
>>>>>>> primary, (< second lags rather than minutes). It adds Async WAL
>>>>>>> Replication[1] on the hbase:meta table; i.e. edits are sprayed across
>>>>>>> replicas as they arrive at the primary's WAL. Before this work, Async
>>>>> WAL
>>>>>>> Replication was only available on user-space tables and the only
>>> option
>>>> for
>>>>>> hbase:meta read replicas was reloading the primaries hfiles on a
>>> period
>>>>>> (minutes). HBASE-18070 also adds an optional client-side 'LoadBalance'
>>>>>> policy that favors read replicas ahead of primary reads falling back
>>> to
>>>> the
>>>>>> primary on fault. Together, these additions allow distributing
>>>> hbase:meta
>>>>>> read load across primary and replicas alleviating 'hotspotting'.
>>>>>> 
>>>>>> I would like to merge the feature to master branch Monday evening if
>>>> there
>>>>>> is no objection (Soon after I'll merge to branch-2 so this feature can
>>>>>> hopefully be included in the upcoming 2.4.0RC).
>>>>>> 
>>>>>> * For the design, see [2].
>>>>>> * For an amalgamated PR of the 5 or 6 reviewed PRs that comprise this
>>>>>> feature, see [3].
>>>>>> * For a PE report that compared performance before and after, see
>>>>>> HBASE-25127 (no regression).
>>>>>> * A report on ITBLL runs is pending to be attached to HBASE-18070 but
>>>> runs
>>>>>> so far show no regression with the feature enabled (ITBLL runs were
>>> done
>>>>>> against a backport of this feature to branch-2 as the ITBLL state of
>>>> master
>>>>>> is currently an unknown).
>>>>>> 
>>>>>> Testing continues mainly looking for further improvement and to better
>>>>>> understand this feature in operation. Documentation is included but in
>>>> need
>>>>>> of polish (working on it).
>>>>>> 
>>>>>> Dump any questions here and I'll be happy to respond. If you need more
>>>> time
>>>>>> to review, just shout.
>>>>>> 
>>>>>> Thanks and thanks to all who contributed to this feature; the
>>> reviewers
>>>> and
>>>>>> the testers in particular.
>>>>>> 
>>>>>> S
>>>>>> 
>>>>>> 1. http://hbase.apache.org/book.html#_asnyc_wal_replication
>>>>>> 2.
>>>>>> 
>>>>>> 
>>>> 
>>> https://docs.google.com/document/d/1jJWVc-idHhhgL4KDRpjMsQJKCl_NRaCLGiH3Wqwd3O8/edit#
>>>>>> This patch is currently missing HBASE-25280, a bug found in testing.
>>>>>> 3. https://github.com/apache/hbase/pull/2643
>>>>>> 
>>>> 
>>> 

Reply via email to