[ 
https://issues.apache.org/jira/browse/HBASE-19925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16351392#comment-16351392
 ] 

Duo Zhang commented on HBASE-19925:
-----------------------------------

Would you mind adding a test for this? Right now we only test the scenario 
where we do not even have the correct zk, so the source gets stuck in the first 
loop, the one that creates the ReplicationEndpoint.

Oh, the first loop has the same problem... It just exits the loop if the 
source is inactive... Maybe we should also change it to a 'for(;;)' loop, and 
if we find that the source is inactive, just return instead of break.
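
To make the break-versus-return point concrete, here is a rough sketch. It is 
not the actual ReplicationSource code: the class, the method names and the two 
suppliers are made up for illustration, and the sleep between retries that a 
real loop would have is left out.

```java
import java.util.function.BooleanSupplier;

// Hypothetical stand-in for the initialization loops discussed above.
// None of these names come from the HBase source.
final class InitLoopSketch {

  /**
   * Problematic shape: the loop ends either because initialization succeeded
   * or because the source became inactive, and both cases fall through.
   * Returns true if the caller would go on to start the worker threads.
   */
  static boolean fallThroughStyle(BooleanSupplier sourceActive,
                                  BooleanSupplier tryInit) {
    while (sourceActive.getAsBoolean()) {
      if (tryInit.getAsBoolean()) {
        break; // initialized successfully
      }
    }
    // Reached on success AND after the source was terminated.
    return true;
  }

  /**
   * Suggested shape: an endless loop that returns as soon as the source is
   * found to be inactive, so nothing after the loop runs in that case.
   */
  static boolean returnStyle(BooleanSupplier sourceActive,
                             BooleanSupplier tryInit) {
    for (;;) {
      if (!sourceActive.getAsBoolean()) {
        return false; // source terminated: do not continue the startup
      }
      if (tryInit.getAsBoolean()) {
        break; // initialized successfully
      }
    }
    return true;
  }
}
```

With a source that is already inactive, fallThroughStyle still reports that 
the workers would be started, while returnStyle stops right there. When the 
source stays active, both behave the same and keep retrying until the 
initialization succeeds.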

Thanks.

> Delete an unreachable peer will triggers all regionservers abort
> ----------------------------------------------------------------
>
>                 Key: HBASE-19925
>                 URL: https://issues.apache.org/jira/browse/HBASE-19925
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Yun Zhao
>            Assignee: Yun Zhao
>            Priority: Critical
>         Attachments: HBASE-19925.master.001.patch
>
>
> Add an unreachable peer
> {code:java}
> add_peer '4', CLUSTER_KEY => "server1.cie.com:2181:/hbase"{code}
> Delete it after a while; the RegionServer then logs the following and 
> stops.
> {code:java}
> 2018-02-02 20:04:25,959 INFO [main-EventThread.replicationSource,4] 
> regionserver.ReplicationSource: Replicating 
> 5467de52-dc46-45be-902c-110dd7a83e06 -> null
> 2018-02-02 20:04:25,960 ERROR 
> [main-EventThread.replicationSource,4.replicationSource.xxxx.com%2C16020%2C1515498473547.default,4]
>  regionserver.ReplicationSource: Unexpected exception in 
> ReplicationSourceWorkerThread, currentPath=null
> java.lang.IllegalArgumentException: Peer with id= 4 is not connected
>  at 
> org.apache.hadoop.hbase.replication.ReplicationPeersZKImpl.getStatusOfPeer(ReplicationPeersZKImpl.java:207)
>  at 
> org.apache.hadoop.hbase.replication.regionserver.ReplicationSource.isPeerEnabled(ReplicationSource.java:327)
>  at 
> org.apache.hadoop.hbase.replication.regionserver.ReplicationSource$ReplicationSourceWorkerThread.run(ReplicationSource.java:512)
> 2018-02-02 20:04:25,960 INFO 
> [main-EventThread.replicationSource,4.replicationSource.xxxx.com%2C16020%2C1515498473547.default,4]
>  regionserver.HRegionServer: STOPPED: Unexpected exception in 
> ReplicationSourceWorkerThread{code}
>  
> HBase 1.2.6



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
