ZhenyuLi created HBASE-28589:
--------------------------------

             Summary: Client Does not Stop Retrying after DoNotRetryException
                 Key: HBASE-28589
                 URL: https://issues.apache.org/jira/browse/HBASE-28589
             Project: HBase
          Issue Type: Bug
          Components: IPC/RPC
    Affects Versions: 2.0.0, 1.5.0, 1.4.0, 1.3.0, 1.2.0
            Reporter: ZhenyuLi


I recently discovered that the fix for HBASE-14598 does not completely resolve 
the issue. That fix had two goals. First, when a Scan/Get RPC would allocate a 
very large array that could lead to an out-of-memory (OOM) error, the server 
checks the size of the array before allocation and throws an exception 
instead, which prevents the region server from crashing and avoids possible 
cascading failures. Second, the developer intended for the client to stop 
retrying after such a failure, because retrying will not resolve the issue.
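
To make the first goal concrete, here is a minimal sketch of a size check 
that runs before the buffer is grown. It is not the HBase code: the class 
CapacityCheckSketch, the method ensureRoom, the exception TooLargeException, 
and the maxSize parameter are all hypothetical names chosen for illustration.

```java
import java.io.IOException;
import java.nio.ByteBuffer;

// Hypothetical names throughout; this is not the HBase implementation.
final class CapacityCheckSketch {

    // Stand-in for a "do not retry" exception type.
    static class TooLargeException extends IOException {
        TooLargeException(String msg) { super(msg); }
    }

    // Returns a buffer with room for `extra` more bytes, or throws if the
    // required size would exceed maxSize. The check runs before allocation.
    static ByteBuffer ensureRoom(ByteBuffer current, int extra, int maxSize)
            throws TooLargeException {
        // Use long arithmetic so the sum cannot overflow an int.
        long needed = (long) current.position() + extra;
        if (needed > maxSize) {
            throw new TooLargeException(
                "need " + needed + " bytes, limit is " + maxSize);
        }
        if (needed <= current.capacity()) {
            return current; // already large enough
        }
        // Grow by doubling, but never below `needed` or above `maxSize`.
        long doubled = 2L * current.capacity();
        int newCapacity =
            (int) Math.min(Math.max(doubled, needed), (long) maxSize);
        ByteBuffer bigger = ByteBuffer.allocate(newCapacity);
        // Copy the bytes written so far without disturbing `current`.
        ByteBuffer written = current.duplicate();
        written.flip();
        bigger.put(written);
        return bigger;
    }
}
```

Failing before the allocation keeps the process alive, but it only helps the 
caller if the exception type actually reaches the client.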

However, the fix works by throwing a DoNotRetryException. After 
ByteBufferOutputStream.write throws the DoNotRetryException, it propagates up 
the call stack (ByteBufferOutputStream.write --> encoder.write --> 
encodeCellsTo --> this.cellBlockBuilder.buildCellBlockStream --> 
call.setResponse) and is ultimately caught in CallRunner.run, where it is 
only logged. Consequently, the DoNotRetryException is never sent back to the 
client. Instead, the client receives a generic exception for the failed RPC 
request and continues retrying, which is not the desired behavior.
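
The difference in client behavior can be sketched with a toy model. Nothing 
below is HBase code: RetrySketch, Rpc, DoNotRetrySketchException, and both 
handler methods are hypothetical, and the "generic exception" is assumed to 
be a plain IOException purely for illustration.

```java
import java.io.IOException;

// Toy model of the reported behavior; all names are hypothetical.
final class RetrySketch {

    // Stand-in for an exception type that tells the client not to retry.
    static class DoNotRetrySketchException extends IOException {
        DoNotRetrySketchException(String msg) { super(msg); }
    }

    // One RPC attempt as seen by the client.
    interface Rpc {
        void call() throws IOException;
    }

    // Server-side work that always fails with the non-retryable error.
    static void serverWork() throws DoNotRetrySketchException {
        throw new DoNotRetrySketchException("response too large");
    }

    // Handler that only logs the error. The client is assumed to observe
    // a generic IOException, so the "do not retry" signal is lost.
    static Rpc swallowingHandler() {
        return () -> {
            try {
                serverWork();
            } catch (DoNotRetrySketchException e) {
                System.err.println("handler logged: " + e.getMessage());
                throw new IOException("call failed");
            }
        };
    }

    // Handler that lets the original exception type reach the client.
    static Rpc propagatingHandler() {
        return RetrySketch::serverWork;
    }

    // Client retry loop: stops on success, on a do-not-retry error, or
    // once maxAttempts is reached. Returns the number of attempts made.
    static int attempts(Rpc rpc, int maxAttempts) {
        int made = 0;
        while (made < maxAttempts) {
            made++;
            try {
                rpc.call();
                return made;
            } catch (DoNotRetrySketchException e) {
                return made; // retrying cannot help, give up immediately
            } catch (IOException e) {
                // generic failure: fall through and try again
            }
        }
        return made;
    }
}
```

In this model, attempts(swallowingHandler(), 5) uses all five attempts, while 
attempts(propagatingHandler(), 5) stops after the first one.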



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
