[ 
https://issues.apache.org/jira/browse/HBASE-29265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17946446#comment-17946446
 ] 

Hernan Gelaf-Romer commented on HBASE-29265:
--------------------------------------------

I need to amend this: I no longer think that 
RetriesExhaustedWithDetailsException can lead to a meta cache clearing 
exception. I traced the code path a little deeper and realized it's likely 
something else. I think a SocketTimeoutException can surface to the client 
even when we could throw an OperationTimeoutExceededException instead. I think 
we're encountering the race condition explained by the comment here: 
https://github.com/apache/hbase/blob/a8ff965536fda48bbb6d1f77b53a55e43b8d9461/hbase-server/src/test/java/org/apache/hadoop/hbase/client/TestClientOperationTimeout.java#L193

> RetriesExhaustedWithDetailsException can create a pathological feedback loop 
> with multigets
> -------------------------------------------------------------------------------------------
>
>                 Key: HBASE-29265
>                 URL: https://issues.apache.org/jira/browse/HBASE-29265
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: Hernan Gelaf-Romer
>            Assignee: Hernan Gelaf-Romer
>            Priority: Major
>
> Similar to https://issues.apache.org/jira/browse/HBASE-27487
>  
> RetriesExhaustedWithDetailsException currently obscures that the underlying 
> exception(s) may be OperationTimeoutExceededException. Because of this, we 
> can still run into situations where a slow request triggers a flood of meta 
> cache clearing exceptions and hotspots the meta table. 
>  
> We should update our exception handling logic to special-case these 
> exceptions and explicitly check whether the root cause of the request 
> failures was an operation timeout. 
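
As a rough illustration of the check proposed in the description above, here is a minimal Java sketch. It is not the HBase implementation: the nested exception classes are hypothetical stand-ins so the example compiles without HBase on the classpath, and the shouldClearMetaCache policy is an assumption about what "special casing" could mean, not the fix that was actually made.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch only: every type below is a hypothetical stand-in, not an HBase class.
final class OperationTimeoutCheck {

  /** Stand-in for the client's operation timeout exception. */
  static class OperationTimeoutExceededException extends RuntimeException {
    OperationTimeoutExceededException(String msg) {
      super(msg);
    }
  }

  /** Stand-in for an aggregate exception carrying one cause per failed action. */
  static class AggregateFailure extends RuntimeException {
    private final List<Throwable> causes;

    AggregateFailure(List<Throwable> causes) {
      super("Failed " + causes.size() + " actions");
      this.causes = new ArrayList<>(causes);
    }

    List<Throwable> getCauses() {
      return causes;
    }
  }

  /** Walks the getCause() chain looking for an operation timeout. */
  static boolean isOperationTimeout(Throwable t) {
    int depth = 0;
    while (t != null && depth++ < 32) { // depth cap guards against cause cycles
      if (t instanceof OperationTimeoutExceededException) {
        return true;
      }
      t = t.getCause();
    }
    return false;
  }

  /** True only when every underlying failure traces back to an operation timeout. */
  static boolean failedOnlyDueToOperationTimeout(AggregateFailure e) {
    List<Throwable> causes = e.getCauses();
    if (causes.isEmpty()) {
      return false;
    }
    for (Throwable cause : causes) {
      if (!isOperationTimeout(cause)) {
        return false;
      }
    }
    return true;
  }

  /** Assumed policy: keep cached region locations when the failure is purely a timeout. */
  static boolean shouldClearMetaCache(Throwable t) {
    if (t instanceof AggregateFailure) {
      return !failedOnlyDueToOperationTimeout((AggregateFailure) t);
    }
    return !isOperationTimeout(t);
  }
}
```

With this policy, an aggregate failure whose causes are all operation timeouts would leave the cache alone, while one that also contains an unrelated failure (for example an IOException) would still clear it.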



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
