[ https://issues.apache.org/jira/browse/HBASE-16345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15584396#comment-15584396 ]
Esteban Gutierrez commented on HBASE-16345:
-------------------------------------------
Thanks [~huaxiang], can you resubmit your patch to see if the OOME
persists?
> RpcRetryingCallerWithReadReplicas#call() should catch some RegionServer
> Exceptions
> ----------------------------------------------------------------------------------
>
> Key: HBASE-16345
> URL: https://issues.apache.org/jira/browse/HBASE-16345
> Project: HBase
> Issue Type: Bug
> Components: Client
> Affects Versions: 2.0.0
> Reporter: huaxiang sun
> Assignee: huaxiang sun
> Attachments: HBASE-16345-v001.patch, HBASE-16345.branch-1.001.patch,
> HBASE-16345.master.001.patch, HBASE-16345.master.002.patch,
> HBASE-16345.master.003.patch, HBASE-16345.master.004.patch,
> HBASE-16345.master.005.patch, HBASE-16345.master.005.patch
>
>
> Updated description, after more debugging based on the comments from Enis.
> The cause is that if the primary replica exhausts its retries too quickly,
> f.get() [1] throws ExecutionException. This exception needs to be ignored so
> that the call can continue with the replicas.
> The other issue is that, after the calls for the replicas are added, if the
> first completed task fails with ExecutionException (because its retries are
> exhausted), the exception is thrown to the client [2].
> In this case, the caller needs to loop through the tasks and wait for one
> that succeeds. If none succeeds, it throws the exception.
> The same applies to scans.
> [1]
> https://github.com/apache/hbase/blob/master/hbase-client/src/main/java/org/apache/hadoop/hbase/client/RpcRetryingCallerWithReadReplicas.java#L197
> [2]
> https://github.com/apache/hbase/blob/master/hbase-client/src/main/java/org/apache/hadoop/hbase/client/RpcRetryingCallerWithReadReplicas.java#L219
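
The behaviour the description asks for (ignore a failed task and keep waiting until one task succeeds, failing only when all have failed) can be sketched with plain java.util.concurrent classes. This is a simplified illustration, not the HBase patch: the class FirstSuccess and the method firstSuccessful are hypothetical names, and unlike the real client it submits the primary and replica calls together instead of adding the replicas after a timeout.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.CompletionService;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical illustration class; not part of the HBase client.
class FirstSuccess {

    // Returns the result of the first task that completes successfully.
    // A task that fails is skipped; the last failure is rethrown only
    // after every task has failed.
    static <T> T firstSuccessful(ExecutorService pool, List<Callable<T>> calls)
            throws InterruptedException, ExecutionException {
        if (calls.isEmpty()) {
            throw new IllegalArgumentException("no calls to run");
        }
        CompletionService<T> cs = new ExecutorCompletionService<>(pool);
        for (Callable<T> call : calls) {
            cs.submit(call);
        }
        ExecutionException last = null;
        for (int i = 0; i < calls.size(); i++) {
            Future<T> done = cs.take();   // next task to finish, in completion order
            try {
                return done.get();        // first success wins
            } catch (ExecutionException e) {
                last = e;                 // remember the failure, keep waiting
            }
        }
        throw last;                       // every task failed
    }

    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            // Stand-ins for the primary call (retries exhausted) and one replica call.
            Callable<String> primary = () -> {
                throw new IllegalStateException("retries exhausted");
            };
            Callable<String> replica = () -> "row-from-replica";
            System.out.println(firstSuccessful(pool, Arrays.asList(primary, replica)));
        } finally {
            pool.shutdownNow();
        }
    }
}
```

The sketch leaves out cancelling the tasks still running after a success, which a real client would likely want to do.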
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)