Liangjun He replied on jira that he wants to join the work.

We plan to schedule an online meeting soon to discuss it.

Will post the meeting schedule here when we find a suitable time.

Feel free to join if you are interested.

Thanks.

张铎(Duo Zhang) <[email protected]> wrote on Thu, Jun 16, 2022 at 22:07:

> Thanks Andrew for the hard work on closing stale issues and let me bump
> this thread...
>
> 张铎(Duo Zhang) <[email protected]> wrote on Sun, Jun 12, 2022 at 21:25:
>
>> The issue for this is HBASE-27109[1], a sub-task of HBASE-15867[2], where
>> we want to remove the dependency on zk from the replication
>> implementation. Once HBASE-15867 is done, there will be no permanent
>> state on zk any more, which means it is always safe to rebuild a cluster
>> with a fresh zk instance.
>>
>> The related issues were opened long ago, such as HBASE-10295[3],
>> HBASE-13773[4], etc. HBASE-15867 nearly solved the problem, as we have
>> already abstracted a replication peer storage interface and a replication
>> queue storage interface; the idea was to add two table based storages to
>> finish the job. But then we found that there is still a cyclic dependency
>> which could fail the startup of a cluster. In the current replication
>> implementation, once we create a new WAL writer, we need to record it in
>> the replication queue storage before writing data to it. If we move the
>> replication queue storage to an HBase table, then this table needs to be
>> writable before we can record the new WAL file in it. On a new cluster,
>> this will hang the cluster startup, as no region besides hbase:meta can
>> be online...
>>
>> In HBASE-27109, I proposed a new way to track the WAL files. Please see
>> the design doc[5] for more details. You may notice that the
>> implementation of claim queues and the replication log cleaner becomes
>> more complicated. This is a trade-off: if we want to make writing and
>> tracking WALs easier, then we need to deal with the complexity in other
>> places. But I think it is worthwhile, as writing WAL is on the critical
>> path of our main read/write flow, whereas claim queues and the replication
>> log cleaner are both background tasks.
>>
>> Feel free to reply here, on the jira issue, or on the design doc.
>> Suggestions are always welcome.
>>
>> 1. https://issues.apache.org/jira/browse/HBASE-27109
>> 2. https://issues.apache.org/jira/browse/HBASE-15867
>> 3. https://issues.apache.org/jira/browse/HBASE-10295
>> 4. https://issues.apache.org/jira/browse/HBASE-13773
>> 5.
>> https://docs.google.com/document/d/1QrSFlDQblxc12aTomE64sVmghrs_g5ys4fU9wGOdMHk/edit?usp=sharing
>>
>
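
For readers new to the thread, the cyclic dependency described in the quoted
message can be illustrated with a toy model. This is only a sketch under
simplifying assumptions, not HBase code: the names QueueStorage,
ExternalStorage, TableStorage, createWalWriter and startUp are all
hypothetical, and hbase:meta is left out of the picture. It only shows why
"record the WAL in queue storage before writing" cannot make progress when
the queue storage itself lives in a table that needs a WAL to come online.

```java
import java.util.ArrayList;
import java.util.List;

public class WalStartupSketch {

    /** Hypothetical stand-in for a replication queue storage; not the real HBase interface. */
    interface QueueStorage {
        boolean isWritable();
        void addWal(String walName);
        void onRegionsOpened();
    }

    /** Storage kept outside HBase (for example on zk): writable before any region is online. */
    static final class ExternalStorage implements QueueStorage {
        final List<String> wals = new ArrayList<>();
        public boolean isWritable() { return true; }
        public void addWal(String walName) { wals.add(walName); }
        public void onRegionsOpened() { /* nothing to do */ }
    }

    /** Storage backed by an HBase table: writable only once its region is online. */
    static final class TableStorage implements QueueStorage {
        final List<String> wals = new ArrayList<>();
        private boolean regionOnline = false;
        public boolean isWritable() { return regionOnline; }
        public void addWal(String walName) {
            if (!regionOnline) {
                throw new IllegalStateException("queue table region is not online");
            }
            wals.add(walName);
        }
        public void onRegionsOpened() { regionOnline = true; }
    }

    /** Rule from the thread: a new WAL must be recorded in queue storage before it is written. */
    static boolean createWalWriter(QueueStorage storage, String walName) {
        if (!storage.isWritable()) {
            return false; // a real cluster would block here instead of returning
        }
        storage.addWal(walName);
        return true;
    }

    /** Opening any region, including the queue table's own, needs a WAL writer first. */
    static boolean startUp(QueueStorage storage) {
        if (!createWalWriter(storage, "wal-0")) {
            return false; // no WAL, so no region opens, so the table never becomes writable
        }
        storage.onRegionsOpened();
        return true;
    }

    public static void main(String[] args) {
        System.out.println("external storage starts: " + startUp(new ExternalStorage()));
        System.out.println("table storage starts:    " + startUp(new TableStorage()));
    }
}
```

In this model the external storage starts up and the table backed storage
does not, which is the hang described above. The sketch does not cover the
tracking approach proposed in HBASE-27109; see the design doc for that.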
