[
https://issues.apache.org/jira/browse/HBASE-12386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14190960#comment-14190960
]
Lars Hofhansl commented on HBASE-12386:
---------------------------------------
{{"Current list of sinks is out of date or empty, updating"}} seems clear
enough to me.
+1 on patch.
One thing we have to think through is what happens when the slave cluster is
down for a bit. We'd choose sinks again on each call. I think that's OK,
especially since we recently dialed the retry interval down to 5 minutes after
a bit.
Also, we can still be in a bad situation where RegionServers die and restart at
the slave cluster: we could go down to a single RS at the peers before we try
to choose sinks again. That's for another issue.
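To make the two cases above concrete, here is a rough sketch of the refresh-on-empty-or-stale behavior. It is not the patch and not HBase code: SinkManager, fetch_peer_servers, mark_stale and report_bad_sink are hypothetical names made up for this illustration, and the real sink selection has more to it.

```python
import random


class SinkManager:
    """Hypothetical sketch of sink selection; not HBase's implementation."""

    def __init__(self, fetch_peer_servers):
        # fetch_peer_servers: assumed callable returning the peer cluster's
        # currently known region servers (an empty list if the peer is down).
        self._fetch = fetch_peer_servers
        self._sinks = []
        self._stale = True  # nothing has been fetched yet

    def mark_stale(self):
        """Flag the list as out of date, e.g. after an error talking to the peer."""
        self._stale = True

    def report_bad_sink(self, sink):
        """Drop a sink that failed; this alone does not trigger a refresh."""
        if sink in self._sinks:
            self._sinks.remove(sink)

    def choose_sink(self):
        if self._stale or not self._sinks:
            # "Current list of sinks is out of date or empty, updating"
            self._sinks = list(self._fetch())
            self._stale = False
        if not self._sinks:
            # The list is still empty, so the next call refreshes again.
            raise RuntimeError("No replication sinks are available")
        return random.choice(self._sinks)
```

In this sketch, while the peer is down every choose_sink() call fetches again, which is the "choose sinks again on each call" case. The list can also shrink to a single sink through report_bad_sink() without any refresh, which is the second situation described above.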
> Replication gets stuck following a transient zookeeper error to remote peer
> cluster
> -----------------------------------------------------------------------------------
>
> Key: HBASE-12386
> URL: https://issues.apache.org/jira/browse/HBASE-12386
> Project: HBase
> Issue Type: Bug
> Components: Replication
> Affects Versions: 0.98.7
> Reporter: Adrian Muraru
> Attachments: HBASE-12386.patch
>
>
> Following a transient ZK error, replication gets stuck and remote peers are
> never updated.
> Source region servers continuously report the following error in their logs:
> "No replication sinks are available"