Re: [DISCUSS] Move replication queue storage from zookeeper to a separated HBase table

Duo Zhang Tue, 28 Jun 2022 19:24:28 -0700

We plan to hold an online meeting from 2 PM to 3 PM (GMT+8) on 1 July, using
Tencent Meeting.


阿米朵 invites you to join a Tencent Meeting
> Meeting topic: HBase Replication Queue Storage
> Meeting time: 2022/07/01 14:00-15:00 (GMT+08:00) China Standard Time - Beijing
>
> Click the link to join the meeting, or add it to your meeting list:
> https://meeting.tencent.com/dm/kZQdGasowxXP
>
> Tencent Meeting number: 430-524-288
> Meeting password: 220701
>
> One-tap mobile dial-in
> +8675536550000,,430524288# (Mainland China)
> +85230018898,,,2,430524288# (Hong Kong, China)
>
> Dial by your location
> +8675536550000 (Mainland China)
> +85230018898 (Hong Kong, China)
>
> Copy this information and open the Tencent Meeting mobile app to join
>

More attendees are always welcome :)


张铎(Duo Zhang) <[email protected]> wrote on Tue, 21 Jun 2022 at 12:46:

> Liangjun He replied on Jira that he wants to join the work.
>
> We plan to schedule an online meeting soon to discuss it.
>
> Will post the meeting schedule here when we find a suitable time.
>
> Feel free to join if you are interested.
>
> Thanks.
>
> 张铎(Duo Zhang) <[email protected]> wrote on Thu, 16 Jun 2022 at 22:07:
>
>> Thanks, Andrew, for the hard work on closing stale issues. Let me bump
>> this thread...
>>
>> 张铎(Duo Zhang) <[email protected]> wrote on Sun, 12 Jun 2022 at 21:25:
>>
>>> The issue for this is HBASE-27109[1], a sub-task of HBASE-15867[2], where
>>> we want to remove the dependency on ZooKeeper from the replication
>>> implementation. Once HBASE-15867 is done, there will be no permanent state
>>> on ZooKeeper any more, which means it is always safe to rebuild a cluster
>>> with a fresh ZooKeeper instance.
>>>
>>> The related issues were opened long ago, such as HBASE-10295[3] and
>>> HBASE-13773[4]. HBASE-15867 nearly solved the problem, as we have already
>>> abstracted a replication peer storage interface and a replication queue
>>> storage interface; the idea was to provide table based implementations of
>>> both. But then we found that there is still a cyclic dependency which
>>> could fail the startup of a cluster. In the current replication
>>> implementation, once we create a new WAL writer, we need to record it in
>>> the replication queue storage before writing data to it. But if we move
>>> the replication queue storage to an HBase table, then this table must be
>>> writable before we can record the new WAL file in it. On a new cluster,
>>> this will hang the cluster startup, as no region besides hbase:meta can be
>>> online...
>>>
>>> In HBASE-27109, I proposed a new way to track the WAL files. Please see
>>> the design doc[5] for more details. You may find that the implementation
>>> of claim queues and the replication log cleaner becomes more complicated.
>>> This is a trade-off: if we want to make writing and tracking WALs easier,
>>> then we need to deal with the complexity in other places. But I think it
>>> is worthwhile, as writing the WAL is on the critical path of our main
>>> read/write flow, whereas claim queues and the replication log cleaner are
>>> both background tasks.
>>>
>>> Feel free to reply here, on the jira issue, or on the design doc.
>>> Suggestions are always welcome.
>>>
>>> 1. https://issues.apache.org/jira/browse/HBASE-27109
>>> 2. https://issues.apache.org/jira/browse/HBASE-15867
>>> 3. https://issues.apache.org/jira/browse/HBASE-10295
>>> 4. https://issues.apache.org/jira/browse/HBASE-13773
>>> 5. https://docs.google.com/document/d/1QrSFlDQblxc12aTomE64sVmghrs_g5ys4fU9wGOdMHk/edit?usp=sharing
>>>
>>
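
The cyclic dependency described in the quoted message can be illustrated with a small dependency-graph sketch. This is only a toy model, not HBase code: every node name below is a made-up label, and the edge saying that bringing the queue table's region online itself needs a writable WAL is an assumption read into the message, not something it states explicitly.

```python
from graphlib import CycleError, TopologicalSorter

# Each key maps a step to the steps that must be ready before it.
# All names are illustrative labels, not HBase identifiers.
ZK_BACKED = {
    "write_to_wal": {"record_wal_in_queue_storage"},
    "record_wal_in_queue_storage": {"zookeeper_available"},
    "bring_region_online": {"write_to_wal"},
}

TABLE_BACKED = {
    "write_to_wal": {"record_wal_in_queue_storage"},
    "record_wal_in_queue_storage": {"queue_table_region_online"},
    # Assumption: opening the queue table's region needs a usable WAL.
    "queue_table_region_online": {"write_to_wal"},
}


def can_start(deps):
    """Return True if the steps can be ordered, False if they form a cycle."""
    try:
        # static_order() is lazy; tuple() forces the full ordering.
        tuple(TopologicalSorter(deps).static_order())
        return True
    except CycleError:
        return False


if __name__ == "__main__":
    print("ZooKeeper-backed queue storage can start:", can_start(ZK_BACKED))
    print("Table-backed queue storage can start:", can_start(TABLE_BACKED))
```

In this model the ZooKeeper-backed graph has a valid start order, while the table-backed graph does not, which corresponds to the startup hang the message describes.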
