[
https://issues.apache.org/jira/browse/HBASE-7632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13558517#comment-13558517
]
Feng Honghua commented on HBASE-7632:
-------------------------------------
The root cause is an ordering race in add_peer. The client/shell first creates
the peerNode and then creates the peerStateNode. The RS receives the peerNode
change event and starts addPeer in the window after the client/shell has
created the peerNode but before it has created the peerStateNode. The RS's
connectToPeer sets a watch on the peerStateNode and creates that node if it
does not exist. Because the client/shell and the RS are separate processes, the
RS's "check and create" can pass the check phase (node missing) and then fail
in the create phase because the client/shell has created the peerStateNode in
the meantime. No exclusion mechanism is available to make the RS's "check and
create" atomic.
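The interleaving described above can be reproduced deterministically with a
small sketch. It is illustrative only: FakeZk is a hypothetical in-memory
stand-in rather than the ZooKeeper client API, checkAndCreate is not the actual
ReplicationPeer code, and the path and the "ENABLED" value are made up. The
Runnable hook stands in for the client/shell acting between the RS's check and
its create.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

class PeerStateRace {
    // Hypothetical in-memory stand-in for a znode tree; NOT the ZooKeeper API.
    static final class FakeZk {
        private final ConcurrentMap<String, String> nodes = new ConcurrentHashMap<>();

        boolean exists(String path) {
            return nodes.containsKey(path);
        }

        // Like a znode create: fails if the path is already present.
        void create(String path, String data) {
            if (nodes.putIfAbsent(path, data) != null) {
                throw new IllegalStateException("node exists: " + path);
            }
        }
    }

    // RS-side "check and create". The hook runs between the two steps and
    // stands in for whatever another process does in that window.
    static void checkAndCreate(FakeZk zk, String path, Runnable betweenSteps) {
        if (!zk.exists(path)) {          // check: node is missing
            betweenSteps.run();          // window in which the client may act
            zk.create(path, "ENABLED");  // create: throws if the client won
        }
    }

    // Returns true if the RS-side create failed because the client got there first.
    static boolean rsCreateFailsWhenClientWins() {
        FakeZk zk = new FakeZk();
        String peerStateNode = "/peers/1/peer-state"; // illustrative path only
        try {
            checkAndCreate(zk, peerStateNode, () -> zk.create(peerStateNode, "ENABLED"));
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("RS create failed: " + rsCreateFailsWhenClientWins());
    }
}
```

With a no-op hook the same checkAndCreate call succeeds, which is why the
failure only shows up when the two processes interleave in this particular
order.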
Such a bug can be avoided by merging the peerState info into the peerNode. When
the RS runs addPeer for a newly added peerNode, it can read the peerState info
from the peerNode itself, so the "check and create" problem above does not
arise.
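A minimal sketch of the merged layout, under stated assumptions: the
"<clusterKey>|<state>" payload encoding, the class and method names, and the
in-memory map standing in for ZooKeeper are all hypothetical and not taken from
HBase. The point it illustrates is that the client/shell writes the peer and
its state in one create, and the RS only reads.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

class MergedPeerNode {
    // Hypothetical in-memory stand-in for the znode tree (path -> payload).
    private final ConcurrentMap<String, String> nodes = new ConcurrentHashMap<>();

    // Assumed payload layout "<clusterKey>|<state>"; the real encoding is not
    // specified in this thread.
    static String encode(String clusterKey, String state) {
        return clusterKey + "|" + state;
    }

    static String stateOf(String payload) {
        return payload.substring(payload.lastIndexOf('|') + 1);
    }

    // Client/shell side: a single create carries both the cluster key and the state.
    boolean addPeer(String peerId, String clusterKey) {
        return nodes.putIfAbsent("/peers/" + peerId, encode(clusterKey, "ENABLED")) == null;
    }

    // RS side: reacts to the peerNode event by reading only; it never creates a node,
    // so there is no check-then-create window to lose.
    String readPeerState(String peerId) {
        String payload = nodes.get("/peers/" + peerId);
        return payload == null ? null : stateOf(payload);
    }

    public static void main(String[] args) {
        MergedPeerNode zk = new MergedPeerNode();
        zk.addPeer("1", "zk1:2181:/hbase");
        System.out.println("peer 1 state: " + zk.readPeerState("1"));
    }
}
```

Because the state is present from the moment the peerNode exists, any RS that
sees the peerNode event also sees a state, regardless of how its handling
interleaves with the client/shell.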
> fail to create ReplicationSource if ReplicationPeer.startStateTracker
> checkExists(peerStateNode) and find not exist but fails in createAndWatch due
> to client/shell is done creating it then, now throws exception and results in
> addPeer fail
> ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
> Key: HBASE-7632
> URL: https://issues.apache.org/jira/browse/HBASE-7632
> Project: HBase
> Issue Type: Bug
> Components: Replication
> Affects Versions: 0.94.2, 0.94.3, 0.94.4
> Reporter: Feng Honghua
> Original Estimate: 48h
> Remaining Estimate: 48h
>
> fail to create ReplicationSource if ReplicationPeer.startStateTracker
> checkExists(peerStateNode) and find not exist but fails in createAndWatch due
> to client/shell is done creating it then, now throws exception and results in
> addPeer fail