[jira] [Commented] (HBASE-13618) ReplicationSource is too eager to remove sinks

Lars Hofhansl (JIRA) Tue, 05 May 2015 21:39:57 -0700

    [ 
https://issues.apache.org/jira/browse/HBASE-13618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14529894#comment-14529894
 ]


Lars Hofhansl commented on HBASE-13618:
---------------------------------------

Comments? Concerns?

The issue I am trying to fix affects long-running region servers. If over 
(say) a month we successfully replicate hundreds of thousands of batches but 
just three batches fail due to random temporary glitches (maybe we did a few 
rolling restarts of the target cluster), we'll still remove the sink.


> ReplicationSource is too eager to remove sinks
> ----------------------------------------------
>
>                 Key: HBASE-13618
>                 URL: https://issues.apache.org/jira/browse/HBASE-13618
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Lars Hofhansl
>            Assignee: Lars Hofhansl
>            Priority: Minor
>         Attachments: 13618.txt
>
>
> While looking at replication for some other reason I noticed that the 
> replication source might be a bit too eager to remove sinks from the list of 
> valid sinks.
> The current logic allows a sink to fail N times (default 3) and then it will 
> be removed from the sinks. But note that this failure count is never reduced, 
> so given enough runtime and some network glitches _every_ sink will 
> eventually be removed. When all sinks are removed the source picks new sinks 
> and the counter is set to 0 for all of them.
> I think we should change this to reset the counter each time we successfully 
> replicate something to the sink (which proves the sink isn't dead). Or we 
> could decrease the counter each time we successfully replicate; that might 
> be better: if we consistently fail more attempts than we succeed, the sink 
> should be removed.
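
To make the three behaviours in the description concrete, here is a minimal sketch of a per-sink failure counter. It is not the HBase code and not the attached patch: the class SinkFailureTracker, its Policy enum, and the method names are all hypothetical, sinks are identified by plain strings, and it assumes a sink is removed on its Nth recorded failure (the exact comparison in the real code is not stated in this thread).

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical illustration only; names and structure are not taken from HBase. */
final class SinkFailureTracker {
    /** The current behaviour plus the two alternatives proposed in the issue. */
    enum Policy { NEVER_REDUCE, RESET_ON_SUCCESS, DECREMENT_ON_SUCCESS }

    private final int threshold;          // N in the description (default 3)
    private final Policy policy;
    private final Map<String, Integer> failures = new HashMap<>();

    SinkFailureTracker(int threshold, Policy policy) {
        this.threshold = threshold;
        this.policy = policy;
    }

    /** Records a failed batch; returns true if the sink should now be removed. */
    boolean reportFailure(String sink) {
        int count = failures.merge(sink, 1, Integer::sum);
        // Assumption: removal happens on the Nth failure.
        return count >= threshold;
    }

    /** Records a successfully replicated batch. */
    void reportSuccess(String sink) {
        switch (policy) {
            case RESET_ON_SUCCESS:
                // A success proves the sink is alive, so forget past failures.
                failures.remove(sink);
                break;
            case DECREMENT_ON_SUCCESS:
                // Each success cancels one failure; drop the entry at zero.
                failures.computeIfPresent(sink, (k, v) -> v > 1 ? v - 1 : null);
                break;
            default:
                // NEVER_REDUCE: successes do not affect the count.
                break;
        }
    }

    int failureCount(String sink) {
        return failures.getOrDefault(sink, 0);
    }
}
```

With NEVER_REDUCE, three failures spread over any number of successes still reach the threshold, which is the problem described above. With RESET_ON_SUCCESS only consecutive failures count, and with DECREMENT_ON_SUCCESS a sink is removed only if failures outpace successes by N.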



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
