[ https://issues.apache.org/jira/browse/HBASE-9591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13772582#comment-13772582 ]
Lars Hofhansl commented on HBASE-9591:
--------------------------------------
Is this 0.96+ only, or a 0.94 issue as well?
> [replication] getting "Current list of sinks is out of date" all the time
> when a source is recovered
> ----------------------------------------------------------------------------------------------------
>
> Key: HBASE-9591
> URL: https://issues.apache.org/jira/browse/HBASE-9591
> Project: HBase
> Issue Type: Bug
> Affects Versions: 0.96.0
> Reporter: Jean-Daniel Cryans
> Priority: Minor
> Fix For: 0.96.1
>
>
> I tried killing a region server while the slave cluster was down. From that
> point on, my log was filled with:
> {noformat}
> 2013-09-20 00:31:03,942 INFO [regionserver60020.replicationSource,1]
> org.apache.hadoop.hbase.replication.regionserver.ReplicationSinkManager:
> Current list of sinks is out of date, updating
> 2013-09-20 00:31:04,226 INFO
> [ReplicationExecutor-0.replicationSource,1-jdec2hbase0403-4,60020,1379636329634]
> org.apache.hadoop.hbase.replication.regionserver.ReplicationSinkManager:
> Current list of sinks is out of date, updating
> {noformat}
> The first log line is from the normal source, the second is from the recovered
> one. When we try to replicate, we call
> replicationSinkMgr.getReplicationSink(), and if the list of machines was
> refreshed since the last time, we call chooseSinks(), which in turn
> refreshes the list of sinks and resets our lastUpdateToPeers. The next source
> notices the change and calls chooseSinks() too. When the first source
> comes back for another round, it sees the list was refreshed and calls
> chooseSinks() again. This repeats until the recovered queue is gone.
> We could have all the sources going to the same cluster share a thread-safe
> ReplicationSinkManager. We could also manage the same cluster separately for
> each source. Or, even easier, if the list we get in chooseSinks() is the same
> as the one we had before, consider it a noop.
> What do you think [~gabriel.reid]?
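
To make the loop described above easier to follow, here is a minimal sketch of the interaction and of the last option (treating an unchanged list as a noop). It is a simplified model, not the HBase code: SharedPeer, SinkManager, simulate() and the noopWhenUnchanged flag are hypothetical names, the timestamp is a plain counter, and placing the comparison in the shared peer state is only one possible reading of the proposal.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Simplified model, not HBase code: all class and method names here are hypothetical.
class SinkRefreshSketch {

    /** State shared by every source that replicates to the same slave cluster. */
    static class SharedPeer {
        private final boolean noopWhenUnchanged;
        private List<String> servers = new ArrayList<>();
        private long lastChange = 0; // logical timestamp of the last recorded list change

        SharedPeer(boolean noopWhenUnchanged) {
            this.noopWhenUnchanged = noopWhenUnchanged;
        }

        /** Stores a freshly fetched server list and records that it changed. */
        void refresh(List<String> fetched) {
            if (noopWhenUnchanged && fetched.equals(servers)) {
                return; // same list as before: leave the timestamp alone
            }
            servers = new ArrayList<>(fetched);
            lastChange++;
        }

        long lastChange() {
            return lastChange;
        }
    }

    /** One instance per source; both instances point at the same SharedPeer. */
    static class SinkManager {
        private final SharedPeer peer;
        private long lastUpdateToPeers = -1;
        int refreshes = 0;

        SinkManager(SharedPeer peer) {
            this.peer = peer;
        }

        /** Refreshes the sinks if the shared list changed since our last update. */
        void getReplicationSink(List<String> fetched) {
            if (peer.lastChange() > lastUpdateToPeers) {
                chooseSinks(fetched);
            }
        }

        private void chooseSinks(List<String> fetched) {
            refreshes++; // stands in for the "out of date, updating" log line
            peer.refresh(fetched);
            lastUpdateToPeers = peer.lastChange();
        }
    }

    /** Alternates a normal and a recovered source; returns the total number of refreshes. */
    static int simulate(boolean noopWhenUnchanged, int rounds) {
        SharedPeer peer = new SharedPeer(noopWhenUnchanged);
        SinkManager normal = new SinkManager(peer);
        SinkManager recovered = new SinkManager(peer);
        List<String> slaveServers = Arrays.asList("rs1", "rs2"); // never changes
        for (int i = 0; i < rounds; i++) {
            normal.getReplicationSink(slaveServers);
            recovered.getReplicationSink(slaveServers);
        }
        return normal.refreshes + recovered.refreshes;
    }

    public static void main(String[] args) {
        System.out.println("current behaviour: " + simulate(false, 5) + " refreshes");
        System.out.println("noop when unchanged: " + simulate(true, 5) + " refreshes");
    }
}
```

In this model, with the flag off, both sources refresh on every round because each refresh moves the shared timestamp past the other source's lastUpdateToPeers. With the flag on, each source refreshes once and then stays quiet for as long as the server list is unchanged.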