Thanks for the update here and the meeting minutes. -n
On Fri, Jul 1, 2022 at 12:46 张铎(Duo Zhang) <[email protected]> wrote:

> Meeting notes:
>
> Attendees: Duo Zhang, Liangjun He, Xin Sun, Tianhang Tang, Yu Li
>
> First Duo explained the design doc again, then the others asked questions
> and we discussed them. Let me post the conclusions here:
>
> 1. If we rely on an HBase table to store the replication metadata, how do
> we use the replication sync up tool to replicate data to the peer cluster
> once the source cluster is fully down?
> We agree that this is a limitation compared to the old zookeeper based
> implementation. Maybe we could mirror the replication metadata to another
> storage system, or use the maintenance mode to bring the hbase:replication
> table online. Not a blocker issue, but at least we need to clearly
> document this.
>
> 2. Since we removed the zookeeper usage, the pressure on zookeeper will
> now move to HBase and HDFS. Will it cause too much pressure and fail the
> cluster under extreme cases?
> After discussion, we mostly agree the risk is low. The heaviest operation
> is claim queue, where we need to list HDFS, but it is the last step of
> SCP, where we have already finished WAL splitting, and it only touches the
> namenode, so in general it will not add too much pressure. Still, when
> implementing, we need to be careful to avoid touching HDFS too much.
>
> 3. If hbase:replication is offline, will it hang replication?
> This is by design, but we should try our best not to hang normal
> read/write while hbase:replication is offline.
>
> 4. Does the sourceServerName in ReplicationQueueId mean the last region
> server which holds the replication queue?
> No, it is the FIRST region server which holds the replication queue. The
> old design tracked all the region servers which held the replication queue
> in the queue id, but actually we only need the first region server for
> locating the WAL files.
>
> 5. How to predict the pressure on the new hbase:replication table?
> For a normal cluster, most of the pressure comes from updates of the
> replication offset. This can be estimated easily as write_throughput /
> replication_size_per_offset_update. Of course, the qps doubles if the
> number of replication peers doubles.
>
> Later we talked about general problems with replication. For example, if
> we have 20~30 replication peers, not only is the pressure on the
> replication metadata a problem, the pressure from reading HDFS is also a
> big problem. We discussed several possible solutions, like having only one
> thread read WAL files instead of one thread per peer, caching the newest
> several WAL files in memory, or having only one replication peer mirror
> all the WAL data to kafka and then using kafka to replicate to other
> systems, etc. Anyway, this is not related to the main topic.
>
> We all agree that the current design doc is huge and there are still lots
> of details in each area. We will open sub tasks to cover the several
> areas, split the design doc into several pieces, and keep polishing it.
>
> Thanks.
>
> 张铎(Duo Zhang) <[email protected]> wrote on Wed, Jun 29, 2022 at 10:23:
>
> > We plan to hold an online meeting from 2PM to 3PM, 1st July, GMT+8,
> > using tencent meeting.
> >
> > 阿米朵 invites you to a Tencent Meeting:
> >> Meeting topic: HBase Replication Queue Storage
> >> Meeting time: 2022/07/01 14:00-15:00 (GMT+08:00) China Standard Time
> >> - Beijing
> >>
> >> Click this url to join the meeting, or add it to your meeting list:
> >> https://meeting.tencent.com/dm/kZQdGasowxXP
> >>
> >> Tencent Meeting number: 430-524-288
> >> Meeting password: 220701
> >>
> >> One-tap dial-in from a mobile phone:
> >> +8675536550000,,430524288# (Mainland China)
> >> +85230018898,,,2,430524288# (Hong Kong, China)
> >>
> >> Dial in by your location:
> >> +8675536550000 (Mainland China)
> >> +85230018898 (Hong Kong, China)
> >>
> >> Copy this information and open the Tencent Meeting mobile app to join.
> >>
> >
> > More attendees are always welcome :)
> >
> > 张铎(Duo Zhang) <[email protected]> wrote on Tue, Jun 21, 2022 at 12:46:
> >
> >> Liangjun He replied on jira that he wants to join the work.
> >>
> >> We plan to schedule an online meeting soon to discuss it.
> >>
> >> We will post the meeting schedule here when we find a suitable time.
> >>
> >> Feel free to join if you are interested.
> >>
> >> Thanks.
> >>
> >> 张铎(Duo Zhang) <[email protected]> wrote on Thu, Jun 16, 2022 at 22:07:
> >>
> >>> Thanks Andrew for the hard work on closing stale issues, and let me
> >>> bump this thread...
> >>>
> >>> 张铎(Duo Zhang) <[email protected]> wrote on Sun, Jun 12, 2022 at 21:25:
> >>>
> >>>> The issue for this is HBASE-27109[1], and it is a sub task of
> >>>> HBASE-15867[2], where we want to remove the dependency on zk in the
> >>>> replication implementation. Once HBASE-15867 is done, there is no
> >>>> permanent state on zk any more, which means we are always safe to
> >>>> rebuild a cluster with a fresh zk instance.
> >>>>
> >>>> The related issues were opened long ago, such as HBASE-10295[3],
> >>>> HBASE-13773[4], etc. HBASE-15867 nearly solved the problem, as we
> >>>> have already abstracted a replication peer storage interface and a
> >>>> replication queue storage interface; the idea was to provide two
> >>>> table based storages and then the problem would be solved. But then
> >>>> we found out there is still a cyclic dependency which could fail the
> >>>> startup of a cluster. In the current replication implementation, once
> >>>> we create a new WAL writer, we need to record it in the replication
> >>>> queue storage before writing data to it. But if we move the
> >>>> replication queue storage to an hbase table, then we need this table
> >>>> to be writable first before we can record the new WAL file in it. On
> >>>> a new cluster, this will hang the cluster start up, as besides
> >>>> hbase:meta, no region can be online...
> >>>>
> >>>> In HBASE-27109, I propose a new way to track the WAL files. Please
> >>>> see the design doc[5] for more details. You may find that the
> >>>> implementation of claim queues and the replication log cleaner
> >>>> becomes more complicated. This is a trade off: if we want to make
> >>>> life easier when writing and tracking WALs, then we need to deal with
> >>>> the complexity in other places. But I think it is worthwhile, as
> >>>> writing the WAL is on the critical path of our main read/write flow,
> >>>> while claim queues and the replication log cleaner are both
> >>>> background tasks.
> >>>>
> >>>> Feel free to reply here, on the jira issue, or on the design doc.
> >>>> Suggestions are always welcome.
> >>>>
> >>>> 1. https://issues.apache.org/jira/browse/HBASE-27109
> >>>> 2. https://issues.apache.org/jira/browse/HBASE-15867
> >>>> 3. https://issues.apache.org/jira/browse/HBASE-10295
> >>>> 4. https://issues.apache.org/jira/browse/HBASE-13773
> >>>> 5. https://docs.google.com/document/d/1QrSFlDQblxc12aTomE64sVmghrs_g5ys4fU9wGOdMHk/edit?usp=sharing
> >>>
> >
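The startup cyclic dependency described in the Jun 12 email can be made concrete with a toy model. This is an illustrative sketch only, not HBase code; the function names and the boolean flag are invented for the example:

```python
# Toy model of the cyclic dependency from HBASE-27109:
# opening a region needs a WAL; creating a WAL must first be recorded
# in the replication queue storage; if that storage is itself an HBase
# table, recording requires that table's region to already be open.

def create_wal(region: str, queue_storage_online: bool) -> None:
    # Old behavior: record the new WAL file in the queue storage
    # before writing any data to it.
    if not queue_storage_online:
        raise RuntimeError(
            "cannot record new WAL: hbase:replication is not online yet")

def open_region(region: str, queue_storage_online: bool) -> str:
    # hbase:meta is the only region that can come online without the
    # queue storage in this toy model (as noted in the email).
    if region != "hbase:meta":
        create_wal(region, queue_storage_online)
    return f"{region} online"

# Fresh cluster: hbase:meta comes up, but hbase:replication itself
# cannot open, because its own WAL cannot be recorded anywhere.
print(open_region("hbase:meta", queue_storage_online=False))
try:
    open_region("hbase:replication", queue_storage_online=False)
except RuntimeError as e:
    print("startup hangs:", e)
```

HBASE-27109 breaks this cycle by changing how WAL files are tracked, so that creating a WAL writer no longer requires the queue storage table to be writable first.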

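The pressure estimate from point 5 of the meeting notes (offset-update qps = write_throughput / replication_size_per_offset_update, scaling with the number of peers) can be sketched as a back-of-the-envelope calculation. The function name and example numbers are illustrative, not taken from HBase:

```python
def offset_update_qps(write_throughput_bytes_per_sec: float,
                      bytes_per_offset_update: float,
                      num_peers: int) -> float:
    """Rough estimate of write qps against hbase:replication.

    Each peer records one offset update per `bytes_per_offset_update`
    of WAL data shipped, so the update rate scales linearly with the
    number of replication peers (point 5 of the meeting notes).
    """
    return write_throughput_bytes_per_sec / bytes_per_offset_update * num_peers

# e.g. 100 MB/s of writes, one offset update per 1 MB shipped, 2 peers:
# 100 updates/s per peer, 200 updates/s in total.
print(offset_update_qps(100 * 1024 * 1024, 1024 * 1024, 2))
```

Even with tens of peers, this stays in the low thousands of updates per second for such a workload, which matches the meeting's conclusion that the metadata-update load itself is manageable while the HDFS read pressure from many peers is the bigger concern.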