Apache9 commented on code in PR #5203: URL: https://github.com/apache/hbase/pull/5203#discussion_r1178051368
##########
src/main/asciidoc/_chapters/ops_mgt.adoc:
##########
@@ -2521,93 +2517,38 @@ NOTE: WALs are saved when replication is enabled or disabled as long as peers ex
 
 [[rs.failover.details]]
 ==== Region Server Failover
 
-When no region servers are failing, keeping track of the logs in ZooKeeper adds no value.
-Unfortunately, region servers do fail, and since ZooKeeper is highly available, it is useful for managing the transfer of the queues in the event of a failure.
+When no region servers are failing, keeping track of the logs in hbase:replication table adds no value.
+However, in case of region server failure, we will manage the transfer of the queues based on hbase:replication.
 
-Each of the master cluster region servers keeps a watcher on every other region server, in order to be notified when one dies (just as the master does). When a failure happens, they all race to create a znode called `lock` inside the dead region server's znode that contains its queues.
-The region server that creates it successfully then transfers all the queues to its own znode, one at a time since ZooKeeper does not support renaming queues.
-After queues are all transferred, they are deleted from the old location.
-The znodes that were recovered are renamed with the ID of the slave cluster appended with the name of the dead server.
+When a region server fails, the HMaster of master cluster will trigger the SCP, and all replication queues on the failed region server will be claimed in the SCP.

Review Comment:
   OK, I think the current ref guide also needs to be updated... I'm not sure which is the accurate version, but starting from a minor release for 2.x, we do not watch on the region server's znode any more; we use SCP to schedule claim queue operations...

##########
src/main/asciidoc/_chapters/ops_mgt.adoc:
##########
@@ -2521,93 +2517,38 @@ NOTE: WALs are saved when replication is enabled or disabled as long as peers ex
 
 [[rs.failover.details]]
 ==== Region Server Failover
 
-When no region servers are failing, keeping track of the logs in ZooKeeper adds no value.
-Unfortunately, region servers do fail, and since ZooKeeper is highly available, it is useful for managing the transfer of the queues in the event of a failure.
+When no region servers are failing, keeping track of the logs in hbase:replication table adds no value.
+However, in case of region server failure, we will manage the transfer of the queues based on hbase:replication.
 
-Each of the master cluster region servers keeps a watcher on every other region server, in order to be notified when one dies (just as the master does). When a failure happens, they all race to create a znode called `lock` inside the dead region server's znode that contains its queues.
-The region server that creates it successfully then transfers all the queues to its own znode, one at a time since ZooKeeper does not support renaming queues.
-After queues are all transferred, they are deleted from the old location.
-The znodes that were recovered are renamed with the ID of the slave cluster appended with the name of the dead server.
+When a region server fails, the HMaster of master cluster will trigger the SCP, and all replication queues on the failed region server will be claimed in the SCP.
+The claim queue operation is just to remove the row of a replication queue, and insert a new row, where we change the server name to the region server which claims the queue.

Review Comment:
   We need to mention that we use the multi-row mutate endpoint here, so the data for a single peer must be in the same region.
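As a rough illustration of the claim described above, the sketch below rewrites a queue row under the claiming server's name and then removes the dead server's row. It is only a sketch: the `hbase:replication` table name comes from the text above, while the row-key layout (`<peerId>-<serverName>`), the helper class, and the method name are assumptions made for the example; the real claim is executed atomically on the server side via the multi-row mutation coprocessor endpoint, not as two separate client calls.

```java
import java.io.IOException;
import org.apache.hadoop.hbase.Cell;
import org.apache.hadoop.hbase.CellUtil;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.Delete;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public final class ClaimQueueSketch {

  /**
   * Move a replication queue row from the dead server to the claiming server.
   * The "<peerId>-<serverName>" row-key layout is an assumption for this sketch;
   * the real implementation goes through ReplicationQueueStorage and applies both
   * mutations atomically with the multi-row mutation coprocessor endpoint.
   */
  public static void claimQueue(Connection conn, String peerId, String deadServer,
      String claimingServer) throws IOException {
    try (Table repl = conn.getTable(TableName.valueOf("hbase:replication"))) {
      byte[] oldRow = Bytes.toBytes(peerId + "-" + deadServer);
      byte[] newRow = Bytes.toBytes(peerId + "-" + claimingServer);

      // Read the dead server's queue row (the tracked WALs and offsets).
      Result old = repl.get(new Get(oldRow));
      if (old.isEmpty()) {
        return; // nothing to claim
      }

      // Copy every cell under the claiming server's row key.
      Put put = new Put(newRow);
      for (Cell cell : old.rawCells()) {
        put.addColumn(CellUtil.cloneFamily(cell), CellUtil.cloneQualifier(cell),
          CellUtil.cloneValue(cell));
      }

      // Shown as two client calls only for clarity; the real claim is a single
      // atomic server-side operation.
      repl.put(put);
      repl.delete(new Delete(oldRow));
    }
  }

  private ClaimQueueSketch() {
  }
}
```

That atomic variant is exactly why the review comment insists the data for a single peer must stay in one region: HBase can only apply a multi-row mutation atomically when all affected rows are hosted by the same region.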
##########
src/main/asciidoc/_chapters/ops_mgt.adoc:
##########
@@ -2475,14 +2471,14 @@ When nodes are removed from the slave cluster, or if nodes go down or come back
 
 ==== Keeping Track of Logs
 
-Each master cluster region server has its own znode in the replication znodes hierarchy.
-It contains one znode per peer cluster (if 5 slave clusters, 5 znodes are created), and each of these contain a queue of WALs to process.
+Each master cluster region server has its queue state in the hbase:replication table.
+It contains one row per peer cluster (if 5 slave clusters, 5 rows are created), and each of these contain a queue of WALs to process.

Review Comment:
   Here things are a bit different. For ZooKeeper it is like a tree: we have a znode for a peer cluster, but under that znode we have lots of files. But for the table based implementation, we have the server name in the row key, which means we will have lots of rows for a given peer...

##########
src/main/asciidoc/_chapters/ops_mgt.adoc:
##########
@@ -2433,26 +2433,22 @@ Replication State Storage::
 `ReplicationPeerStorage` and `ReplicationQueueStorage`. The former one is for storing the replication peer related states, and the latter one is for storing the replication queue related states.
- HBASE-15867 is only half done, as although we have abstract these two interfaces, we still only
-have zookeeper based implementations.
+ And in HBASE-27109, we have implemented the `ReplicationQueueStorage` interface to store the replication queue in the hbase:replication table.
 
 Replication State in ZooKeeper::
 By default, the state is contained in the base node _/hbase/replication_.
- Usually this nodes contains two child nodes, the `peers` znode is for storing replication peer
-state, and the `rs` znodes is for storing replication queue state.
+ Currently, this nodes contains only one child node, namely `peers` znode, which is used for storing replication peer state.

Review Comment:
   As we now only have one ref guide, on the master branch, we should include the documentation for all branches. So here I think we should mention that after 3.0.0 it only contains one child node, but before 3.0.0 we still use zk to store queue data.
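Since the guide has to describe both the pre-3.0.0 and the 3.0.0+ layouts, one quick way to check which layout a given cluster actually uses is to list the children of the replication base node. Below is a minimal sketch with the plain ZooKeeper client, assuming the default `/hbase/replication` path mentioned in the diff and a placeholder quorum address (`zk-host:2181`); on 3.0.0+ only `peers` should show up, while earlier releases also list `rs` holding the queue data.

```java
import java.util.List;
import org.apache.zookeeper.ZooKeeper;

public final class ListReplicationZnodes {

  public static void main(String[] args) throws Exception {
    // Placeholder quorum address; adjust the path if your deployment overrides
    // the base znode configuration.
    ZooKeeper zk = new ZooKeeper("zk-host:2181", 30000, event -> { });
    try {
      // Lists the replication child znodes: "peers" only on 3.0.0+,
      // "peers" and "rs" on earlier releases.
      List<String> children = zk.getChildren("/hbase/replication", false);
      children.forEach(System.out::println);
    } finally {
      zk.close();
    }
  }

  private ListReplicationZnodes() {
  }
}
```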
