[ 
https://issues.apache.org/jira/browse/HBASE-16336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15724132#comment-15724132
 ] 

Guanghao Zhang commented on HBASE-16336:
----------------------------------------

HBASE-12769 tries to fix this through hbck. A more automatic approach is to add 
a replication zk node checker on the master that periodically checks for and 
deletes useless replication zk nodes. In our use case, we found that dead rs 
znodes are left behind, and a dead rs znode can only be transferred when another 
rs restarts. So the replication zk node checker should check dead rs znodes too. 
I know the more proper solution is HBASE-11392 and HBASE-12439, but for 
branch-1 we can resolve this with a replication zk node checker. Any ideas? [~enis] 
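
To make the proposal concrete, here is a rough sketch of the two checks such a checker might run on each pass. It is only an illustration, not HBase code: the class and method names are hypothetical, the ZooKeeper state is modeled as plain in-memory maps, and it assumes a queue ID is either "<peerId>" or "<peerId>-<deadServer>..." for a recovered queue, with peer IDs that contain no hyphen.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical sketch: none of these names are real HBase APIs.
class ReplicationQueueCheckerSketch {

    // Assumption: the peer ID is the part of the queue ID before the first '-'.
    static String peerIdOf(String queueId) {
        int dash = queueId.indexOf('-');
        return dash < 0 ? queueId : queueId.substring(0, dash);
    }

    // Check 1: queues that belong to a peer that no longer exists.
    // Returns "<replicator>/<queueId>" entries that are candidates for deletion.
    static List<String> findQueuesOfRemovedPeers(
            Map<String, Set<String>> queuesByReplicator, Set<String> livePeerIds) {
        List<String> stale = new ArrayList<>();
        for (Map.Entry<String, Set<String>> e : queuesByReplicator.entrySet()) {
            for (String queueId : e.getValue()) {
                if (!livePeerIds.contains(peerIdOf(queueId))) {
                    stale.add(e.getKey() + "/" + queueId);
                }
            }
        }
        return stale;
    }

    // Check 2: replicator znodes whose region server is no longer alive.
    // Their queues still need to be transferred or cleaned up.
    static List<String> findDeadReplicators(
            Map<String, Set<String>> queuesByReplicator, Set<String> liveServers) {
        List<String> dead = new ArrayList<>();
        for (String replicator : queuesByReplicator.keySet()) {
            if (!liveServers.contains(replicator)) {
                dead.add(replicator);
            }
        }
        return dead;
    }

    public static void main(String[] args) {
        // Replicator and queue taken from the stack trace in this issue;
        // peer "TestPeer" has been removed, so no peers are live.
        Map<String, Set<String>> queues = new TreeMap<>();
        queues.put("hbase4124.ash2.facebook.com,16020,1470150251042",
                new TreeSet<>(Arrays.asList("TestPeer")));
        Set<String> livePeers = new TreeSet<>();
        Set<String> liveServers = new TreeSet<>();

        System.out.println(findQueuesOfRemovedPeers(queues, livePeers));
        System.out.println(findDeadReplicators(queues, liveServers));
    }
}
```

A real checker would read the replicator, queue, and peer lists from ZooKeeper and would have to guard against racing with a peer that is being added or a queue that is being claimed by another rs, which this sketch ignores.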

> Removing peers seem to be leaving spare queues
> ----------------------------------------------
>
>                 Key: HBASE-16336
>                 URL: https://issues.apache.org/jira/browse/HBASE-16336
>             Project: HBase
>          Issue Type: Sub-task
>          Components: Replication
>            Reporter: Joseph
>
> I have been running IntegrationTestReplication repeatedly with the backported 
> Replication Table changes. Every other iteration of the test fails with the 
> error below, even though these queues should have been deleted when we removed 
> the peers. I believe this may be related to HBASE-16096, HBASE-16208, or 
> HBASE-16081.
> 16/08/02 08:36:07 ERROR util.AbstractHBaseTool: Error running command-line tool
> org.apache.hadoop.hbase.replication.ReplicationException: undeleted queue for peerId: TestPeer, replicator: hbase4124.ash2.facebook.com,16020,1470150251042, queueId: TestPeer
>       at org.apache.hadoop.hbase.replication.ReplicationPeersZKImpl.checkQueuesDeleted(ReplicationPeersZKImpl.java:544)
>       at org.apache.hadoop.hbase.replication.ReplicationPeersZKImpl.addPeer(ReplicationPeersZKImpl.java:127)
>       at org.apache.hadoop.hbase.client.replication.ReplicationAdmin.addPeer(ReplicationAdmin.java:200)
>       at org.apache.hadoop.hbase.test.IntegrationTestReplication$VerifyReplicationLoop.setupTablesAndReplication(IntegrationTestReplication.java:239)
>       at org.apache.hadoop.hbase.test.IntegrationTestReplication$VerifyReplicationLoop.run(IntegrationTestReplication.java:325)
>       at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
>       at org.apache.hadoop.hbase.test.IntegrationTestReplication.runTestFromCommandLine(IntegrationTestReplication.java:418)
>       at org.apache.hadoop.hbase.IntegrationTestBase.doWork(IntegrationTestBase.java:134)
>       at org.apache.hadoop.hbase.util.AbstractHBaseTool.run(AbstractHBaseTool.java:112)
>       at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
>       at org.apache.hadoop.hbase.test.IntegrationTestReplication.main(IntegrationTestReplication.java:424)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
