[ 
https://issues.apache.org/jira/browse/HBASE-16096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15512492#comment-15512492
 ] 

Appy commented on HBASE-16096:
------------------------------

This new test may have made TestReplicationSourceManagerZkImpl flaky. See [this 
failure|https://builds.apache.org/job/HBASE-Flaky-Tests/3399/testReport/junit/org.apache.hadoop.hbase.replication.regionserver/TestReplicationSourceManagerZkImpl/testLogRoll/].
Am I interpreting it right that this test case leaves the fake peer around, and 
that the next test case, testLogRoll, then tries to connect to it?
[~ashu210890], [~Vegetable26]

cc. [~jmhsieh]


> Replication keeps accumulating znodes
> -------------------------------------
>
>                 Key: HBASE-16096
>                 URL: https://issues.apache.org/jira/browse/HBASE-16096
>             Project: HBase
>          Issue Type: Bug
>          Components: Replication
>    Affects Versions: 2.0.0, 1.2.0, 1.3.0
>            Reporter: Ashu Pachauri
>            Assignee: Joseph
>             Fix For: 2.0.0, 1.3.0, 1.4.0
>
>         Attachments: HBASE-16096-branch-1.patch, HBASE-16096.patch
>
>
> If there is an error while creating the replication source when adding a 
> peer, the source is not added to the in-memory list of sources, but the 
> replication peer is. 
> In that scenario, when you remove the peer, it is deleted from ZooKeeper 
> successfully, but before removing it from the in-memory list of peers we wait 
> for the corresponding sources to be deleted (which, as noted above, don't 
> exist because creating the source failed). 
> The problem here is the ordering of operations for adding/removing the source 
> and the peer. 
> Modifying the code to always remove queues from the underlying storage, even 
> if no sources exist, also requires a small refactoring of 
> TableBasedReplicationQueuesImpl so that it does not abort on removeQueues() 
> of an empty queue.
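
The ordering problem in the description above can be illustrated with a toy
model. This is only a sketch under stated assumptions: the class, fields, and
methods below (PeerRemovalSketch, zkQueues, removePeerGated, removePeerAlways)
are hypothetical stand-ins, not HBase APIs, and plain in-memory sets stand in
for ZooKeeper and the replication source manager. It contrasts cleanup that is
gated on an in-memory source with cleanup that always runs.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Toy model of peer/source bookkeeping. All names are hypothetical, not HBase APIs. */
public class PeerRemovalSketch {
    public final Set<String> zkPeers = new HashSet<>();    // stands in for peer znodes
    public final Set<String> zkQueues = new HashSet<>();   // stands in for queue znodes
    public final Set<String> memPeers = new HashSet<>();   // in-memory list of peers
    public final Map<String, String> memSources = new HashMap<>(); // in-memory sources

    /** Registers the peer first; the source is only recorded if its creation succeeds. */
    public void addPeer(String id, boolean sourceCreationFails) {
        zkPeers.add(id);
        zkQueues.add(id);
        memPeers.add(id);
        if (sourceCreationFails) {
            return; // the peer is now known, but no source exists for it
        }
        memSources.put(id, "source-" + id);
    }

    /** Assumed problematic ordering: queue and peer cleanup depend on finding a source. */
    public void removePeerGated(String id) {
        zkPeers.remove(id);
        if (memSources.remove(id) != null) {
            zkQueues.remove(id);
            memPeers.remove(id);
        }
    }

    /** Assumed fixed ordering: always clean up, and tolerate a missing or empty queue. */
    public void removePeerAlways(String id) {
        zkPeers.remove(id);
        memSources.remove(id); // may be absent; that is fine
        zkQueues.remove(id);   // removing a missing queue is a no-op, not an abort
        memPeers.remove(id);
    }

    public static void main(String[] args) {
        PeerRemovalSketch gated = new PeerRemovalSketch();
        gated.addPeer("p1", true);
        gated.removePeerGated("p1");
        System.out.println("gated:  queues=" + gated.zkQueues + " peers=" + gated.memPeers);

        PeerRemovalSketch always = new PeerRemovalSketch();
        always.addPeer("p1", true);
        always.removePeerAlways("p1");
        System.out.println("always: queues=" + always.zkQueues + " peers=" + always.memPeers);
    }
}
```

In this model, the gated variant leaves both the queue entry and the in-memory
peer behind after a failed source creation, while the unconditional variant
leaves nothing. The actual patch may structure this differently.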



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
