[ https://issues.apache.org/jira/browse/HBASE-14937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15062484#comment-15062484 ]
Ted Yu commented on HBASE-14937:
--------------------------------
{code}
this.callTimeout *= callTimeoutRetryCounter * 2;
{code}
Would the timeout increase too fast after several retries? The multiplier itself grows with the retry counter.
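
To make the concern concrete, here is a rough sketch of the arithmetic behind the update quoted above. The 60000 ms starting value and a retry counter that starts at 1 and increases by one per retry are assumptions for illustration, not taken from the patch; TimeoutGrowth and its methods are hypothetical names.

```java
// Sketch only: models `callTimeout *= callTimeoutRetryCounter * 2` in long
// arithmetic so the growth can be observed without integer overflow.
// Assumed (not from the patch): the counter starts at 1 and is incremented
// once per retry; the starting timeout is 60000 ms.
final class TimeoutGrowth {

    /** Applies the quoted update once and returns the new timeout in ms. */
    static long next(long callTimeoutMs, int retryCounter) {
        return callTimeoutMs * retryCounter * 2L;
    }

    /** Returns the first retry at which the timeout exceeds Integer.MAX_VALUE. */
    static int firstOverflowRetry(long baseTimeoutMs) {
        if (baseTimeoutMs <= 0) {
            throw new IllegalArgumentException("base timeout must be positive");
        }
        long timeout = baseTimeoutMs;
        int retry = 0;
        while (timeout <= Integer.MAX_VALUE) {
            retry++;
            timeout = next(timeout, retry);
        }
        return retry;
    }

    public static void main(String[] args) {
        long timeout = 60_000L;
        for (int retry = 1; retry <= 6; retry++) {
            timeout = next(timeout, retry);
            System.out.println("retry " + retry + ": " + timeout + " ms");
        }
        System.out.println("first overflow at retry " + firstOverflowRetry(60_000L));
    }
}
```

Under these assumptions the timeout after n retries is base * 2^n * n!, so a 60000 ms start passes Integer.MAX_VALUE on the sixth retry.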
{code}
LOG.debug("Replication RPC request call timeout " + this.callTimeout
    + " overflows integer value. Setting it to integer max value.");
{code}
Please include the retry count in the above message.
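
Something along these lines could work. This is only a sketch: ClampedTimeout, clampToInt and overflowMessage are hypothetical names, and the exact wording of the message is a suggestion rather than what the patch contains. The new value is computed as a long so that the overflow can be detected before it is stored in an int.

```java
// Sketch only: clamps a timeout computed in long arithmetic to the int range
// and builds a debug message that carries the retry count.
// All names here are hypothetical and do not come from the patch.
final class ClampedTimeout {

    /** Returns candidateMs if it fits in an int, otherwise Integer.MAX_VALUE. */
    static int clampToInt(long candidateMs) {
        return (int) Math.min(candidateMs, (long) Integer.MAX_VALUE);
    }

    /** Builds the overflow message, including how many retries led to it. */
    static String overflowMessage(long candidateMs, int retryCount) {
        return "Replication RPC request call timeout " + candidateMs
            + " overflows integer value after " + retryCount
            + " retries. Setting it to integer max value.";
    }

    public static void main(String[] args) {
        long candidate = 2_764_800_000L; // example value larger than Integer.MAX_VALUE
        if (candidate > Integer.MAX_VALUE) {
            System.out.println(overflowMessage(candidate, 6));
        }
        System.out.println(clampToInt(candidate));
    }
}
```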
If we keep getting CallTimeoutException, the call would be retried
repeatedly. Should an upper bound be set for the total duration of the retries?
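
One possible shape for such a bound, as a sketch only: keep a deadline for the whole retry sequence and cap each per-call timeout at whatever is left. RetryBudget and its methods are hypothetical names, the time source is passed in explicitly (for example System.nanoTime()), and what replication should do once the budget is exhausted is left open here.

```java
import java.util.concurrent.TimeUnit;

// Sketch only: a total-duration budget for a sequence of retries.
// Hypothetical helper, not part of the patch or of HBase.
final class RetryBudget {
    private final long deadlineNanos;

    /** Starts a budget of maxTotalMs measured from nowNanos. */
    RetryBudget(long maxTotalMs, long nowNanos) {
        this.deadlineNanos = nowNanos + TimeUnit.MILLISECONDS.toNanos(maxTotalMs);
    }

    /** Remaining budget in milliseconds, never negative. */
    long remainingMs(long nowNanos) {
        return Math.max(0L, TimeUnit.NANOSECONDS.toMillis(deadlineNanos - nowNanos));
    }

    /** Caps the proposed per-call timeout at the remaining budget; 0 means the budget is spent. */
    long nextTimeoutMs(long proposedMs, long nowNanos) {
        return Math.min(proposedMs, remainingMs(nowNanos));
    }

    public static void main(String[] args) {
        long start = System.nanoTime();
        RetryBudget budget = new RetryBudget(300_000L, start); // 5 minute budget, assumed value
        System.out.println(budget.nextTimeoutMs(120_000L, start));
    }
}
```

The caller would stop growing the timeout, or stop retrying, once nextTimeoutMs returns 0.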
> Make rpc call timeout for replication adaptive
> ----------------------------------------------
>
> Key: HBASE-14937
> URL: https://issues.apache.org/jira/browse/HBASE-14937
> Project: HBase
> Issue Type: Improvement
> Reporter: Ashish Singhi
> Assignee: Ashish Singhi
> Labels: replication
> Fix For: 2.0.0, 1.3.0
>
> Attachments: HBASE-14937.patch
>
>
> When replication to a peer cluster is disabled while a lot of writes are
> happening in the active cluster, and replication to the peer is later enabled,
> replication requests to the peer cluster may time out.
> This is possible after HBASE-13153, and it can also happen when a large amount
> of WAL data is still pending replication.
> The approach to this problem will be discussed in the comments.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)