cuijianwei created HBASE-12769:
----------------------------------
Summary: Replication fails to delete all corresponding zk nodes
when peer is removed
Key: HBASE-12769
URL: https://issues.apache.org/jira/browse/HBASE-12769
Project: HBase
Issue Type: Improvement
Components: Replication
Affects Versions: 0.99.2
Reporter: cuijianwei
Priority: Minor
When a peer is removed, the client deletes the peerId node under peersZNode.
The live region servers are then notified and delete the corresponding hlog
queues under their own rsZNode in the replication znode tree. However, if there
are failed servers whose hlog queues have not yet been transferred to live
servers (this is likely when "replication.sleep.before.failover" is set to a
large value and many region servers have restarted), those hlog queues are not
deleted when the peer is removed. I think remove_peer should guarantee that all
corresponding zk nodes have been removed by the time it completes; otherwise,
if we later create a new peer with the same peerId as the removed one, the
leftover hlog queues may cause unexpected data to be replicated.
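
To make the proposed guarantee concrete, here is a minimal sketch of the
cleanup, written against an in-memory map rather than a real ZooKeeper client.
Everything in it is an assumption for illustration: the class and method names
are hypothetical and not part of HBase, the map stands in for the rsZNode
children (one entry per region server, alive or failed), and queue ids are
assumed to be either "peerId" or "peerId-<deadServer>..." with no "-" inside
the peerId itself.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

/** Hypothetical sketch: an in-memory stand-in for the replication rsZNode tree. */
class PeerQueueCleanup {

    /** Assumed queue naming: "peerId" or "peerId-<deadServer>..." for recovered queues. */
    static boolean belongsToPeer(String queueId, String peerId) {
        return queueId.equals(peerId) || queueId.startsWith(peerId + "-");
    }

    /**
     * Removes every queue of the given peer under every server node,
     * including servers that have failed and whose queues were never
     * transferred. Returns the removed paths as "server/queueId".
     */
    static List<String> removeQueuesOfPeer(Map<String, Set<String>> queuesByServer,
                                           String peerId) {
        List<String> removed = new ArrayList<>();
        for (Map.Entry<String, Set<String>> server : queuesByServer.entrySet()) {
            Iterator<String> it = server.getValue().iterator();
            while (it.hasNext()) {
                String queueId = it.next();
                if (belongsToPeer(queueId, peerId)) {
                    it.remove(); // in a real system this would be a znode delete
                    removed.add(server.getKey() + "/" + queueId);
                }
            }
        }
        return removed;
    }

    public static void main(String[] args) {
        Map<String, Set<String>> queuesByServer = new TreeMap<>();
        // A live server replicating to peers "1" and "2".
        queuesByServer.put("rs-alive", new TreeSet<>(Arrays.asList("1", "2")));
        // A failed server whose queues have not been transferred yet.
        queuesByServer.put("rs-dead", new TreeSet<>(Arrays.asList("1", "1-rs-older", "10")));

        List<String> removed = removeQueuesOfPeer(queuesByServer, "1");
        System.out.println("removed:   " + removed);
        System.out.println("remaining: " + queuesByServer);
    }
}
```

The point of the sketch is only that the scan covers all server nodes, not just
the ones whose region servers are alive to receive the peer-removed
notification. A real fix would also have to cope with a live server claiming a
failed server's queues while the scan is in progress, which this sketch ignores.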