cuijianwei created HBASE-12769:
----------------------------------

             Summary: Replication fails to delete all corresponding zk nodes 
when peer is removed
                 Key: HBASE-12769
                 URL: https://issues.apache.org/jira/browse/HBASE-12769
             Project: HBase
          Issue Type: Improvement
          Components: Replication
    Affects Versions: 0.99.2
            Reporter: cuijianwei
            Priority: Minor


When a peer is removed, the client deletes the peerId node under peersZNode; 
the live region servers are then notified and delete the corresponding hlog 
queues under their own rsZNode for replication. However, if there are failed 
servers whose hlog queues have not yet been claimed by live servers (this is 
likely when "replication.sleep.before.failover" is set to a large value and 
many region servers restart), those hlog queues are not deleted when the peer 
is removed. I think remove_peer should guarantee that all corresponding zk 
nodes have been removed by the time it completes; otherwise, if we create a 
new peer with the same peerId as the removed one, unexpected data might be 
replicated.
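
To illustrate the proposed guarantee, here is a minimal sketch of a cleanup 
pass that walks every region server's queue list, not only those of live 
servers, and drops the queues belonging to the removed peer. It is an 
in-memory model in Python, not HBase code: the dict layout, the function 
names, and the assumption that a queue id is either "<peerId>" or 
"<peerId>-<deadServer>..." are illustrative only.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical model of the replication znode tree:
#   rs_znode[server_name] -> set of queue ids stored under that server.
# Assumption: a queue id is "<peerId>" for a server's own queue, or
# "<peerId>-<deadServer>..." for a queue claimed from a failed server.
RsZnode = Dict[str, Set[str]]


def queue_peer_id(queue_id: str) -> str:
    """Return the peer id encoded at the start of a queue id."""
    return queue_id.split("-", 1)[0]


def remove_peer_queues(rs_znode: RsZnode, peer_id: str) -> List[Tuple[str, str]]:
    """Delete every queue of peer_id under every server, live or failed.

    Returns the (server, queue_id) pairs that were removed.
    """
    removed = []
    for server in sorted(rs_znode):
        queues = rs_znode[server]
        # Iterate over a sorted copy so the set can be modified safely.
        for queue_id in sorted(queues):
            if queue_peer_id(queue_id) == peer_id:
                queues.discard(queue_id)
                removed.append((server, queue_id))
    return removed


if __name__ == "__main__":
    tree = {
        "rs1,60020,100": {"1", "2"},
        # Failed server whose queues no live server has claimed yet.
        "rs2,60020,50": {"1", "2"},
    }
    for server, queue_id in remove_peer_queues(tree, "1"):
        print("removed", queue_id, "under", server)
```

The point of the sketch is that the sweep does not depend on a live server 
receiving the peer-removed notification, so queues left under a failed 
server's node are covered as well.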



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
