[ 
https://issues.apache.org/jira/browse/HBASE-14937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15064193#comment-15064193
 ] 

Andrew Purtell commented on HBASE-14937:
----------------------------------------

When replication is down, say because of a network partition or temporary issue 
on one cluster, RPC calls can of course time out. Once the network or cluster 
is back in operation we want replication activity to resume as quickly as 
possible. Does this change prevent timely restart of replication activity? 
Won't we potentially be waiting a long time for the current call to time out 
before probing with another? Would the time we might wait unnecessarily 
increase as the duration of the outage increases, making a long outage a 
really, really long outage?
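
To make the concern concrete, here is a minimal sketch of an adaptive timeout that grows on each timed-out call but is capped and reset on success. It is not taken from HBASE-14937.patch; the class name `AdaptiveTimeout`, its methods, and the millisecond values are all hypothetical and only illustrate how a ceiling would bound the extra wait after a long outage.

```java
// Hypothetical illustration only; not HBase code and not the attached patch.
final class AdaptiveTimeout {
    private final long baseMillis;   // timeout used when calls are succeeding
    private final long maxMillis;    // ceiling, so a long outage cannot inflate the wait without bound
    private final int multiplier;    // growth factor applied after each timed-out call
    private long currentMillis;

    AdaptiveTimeout(long baseMillis, long maxMillis, int multiplier) {
        if (baseMillis <= 0 || maxMillis < baseMillis || multiplier < 2) {
            throw new IllegalArgumentException("invalid timeout settings");
        }
        this.baseMillis = baseMillis;
        this.maxMillis = maxMillis;
        this.multiplier = multiplier;
        this.currentMillis = baseMillis;
    }

    /** Timeout to use for the next replication call. */
    long timeoutMillis() {
        return currentMillis;
    }

    /** Called when a call timed out: grow the timeout, but never past the cap. */
    void onTimeout() {
        if (currentMillis > maxMillis / multiplier) {
            currentMillis = maxMillis;   // also avoids overflow in the multiplication
        } else {
            currentMillis *= multiplier;
        }
    }

    /** Called when a call succeeded: fall back to the base timeout. */
    void onSuccess() {
        currentMillis = baseMillis;
    }
}
```

With a cap in place, the longest a source would wait on a single call before probing again is `maxMillis`, regardless of how long the peer has been unreachable. Whether the patch behaves this way is exactly the question above.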

> Make rpc call timeout for replication adaptive
> ----------------------------------------------
>
>                 Key: HBASE-14937
>                 URL: https://issues.apache.org/jira/browse/HBASE-14937
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: Ashish Singhi
>            Assignee: Ashish Singhi
>              Labels: replication
>             Fix For: 2.0.0, 1.3.0
>
>         Attachments: HBASE-14937.patch
>
>
> When replication to a peer cluster is disabled while many writes are 
> happening on the active cluster, and replication is later re-enabled, 
> replication requests to the peer cluster may time out.
> This is possible after HBASE-13153, and it can also happen when a large 
> amount of WAL data is still pending replication.
> The approach to this problem will be discussed in the comments.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
