[
https://issues.apache.org/jira/browse/HBASE-28589?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
ZhenyuLi updated HBASE-28589:
-----------------------------
External issue ID: (was: HBase-14598)
> Client Does not Stop Retrying after DoNotRetryException
> -------------------------------------------------------
>
> Key: HBASE-28589
> URL: https://issues.apache.org/jira/browse/HBASE-28589
> Project: HBase
> Issue Type: Bug
> Components: IPC/RPC
> Affects Versions: 1.2.0, 1.3.0, 1.4.0, 1.5.0, 2.0.0
> Reporter: ZhenyuLi
> Priority: Minor
>
> I recently discovered that the fix for HBASE-14598 does not completely
> resolve the issue. That fix addressed two aspects. First, when a Scan/Get
> RPC attempts to allocate a very large array that could lead to an
> out-of-memory (OOM) error, the server checks the size of the array before
> allocation and throws an exception directly, which prevents the region
> server from crashing and avoids possible cascading failures. Second, the
> developer intended for the client to stop retrying after such a failure,
> because retrying will not resolve the issue.
> However, the fix works by throwing a DoNotRetryException. After
> ByteBufferOutputStream.write throws the DoNotRetryException, it propagates
> up the call stack (ByteBufferOutputStream.write --> encoder.write -->
> encodeCellsTo --> this.cellBlockBuilder.buildCellBlockStream -->
> call.setResponse) and is ultimately caught in CallRunner.run, where it is
> only logged. Consequently, the DoNotRetryException is not sent back to the
> client. Instead, the client receives a generic exception for the failed RPC
> request and continues retrying, which is not the desired behavior.
> The code of CallRunner confirms this: a DoNotRetryException thrown from
> call.setResponse is swallowed by the error handler, which only writes a
> log message.
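
The swallowing pattern described in the report can be illustrated with a
minimal, self-contained sketch. This is not HBase code: the classes Call and
DoNotRetryException and the methods runSwallowing, runPropagating, and
clientShouldRetry are hypothetical stand-ins, and the "propagating" variant
is only one possible way to surface the exception, assumed here for
comparison.

```java
import java.io.IOException;

public class SwallowedExceptionSketch {
    // Hypothetical stand-in for the do-not-retry exception named in the report.
    static class DoNotRetryException extends IOException {
        DoNotRetryException(String msg) { super(msg); }
    }

    // Hypothetical, heavily simplified server-side call object.
    static class Call {
        Throwable errorSentToClient; // null means no specific error was sent

        // Fails before "allocating" when the response would be too large.
        void setResponse(int requestedBytes, int maxBytes) throws IOException {
            if (requestedBytes > maxBytes) {
                throw new DoNotRetryException(
                    "requested " + requestedBytes + " bytes, limit " + maxBytes);
            }
        }

        void setErrorResponse(Throwable t) { errorSentToClient = t; }
    }

    // Pattern from the report: the exception is caught and only logged.
    static void runSwallowing(Call call, int requested, int max) {
        try {
            call.setResponse(requested, max);
        } catch (Exception e) {
            System.err.println("error while running call: " + e);
        }
    }

    // Assumed alternative: also record the exception as the error response.
    static void runPropagating(Call call, int requested, int max) {
        try {
            call.setResponse(requested, max);
        } catch (Exception e) {
            System.err.println("error while running call: " + e);
            call.setErrorResponse(e);
        }
    }

    // Simplified client policy: retry unless told explicitly not to.
    static boolean clientShouldRetry(Call call) {
        return !(call.errorSentToClient instanceof DoNotRetryException);
    }

    public static void main(String[] args) {
        Call swallowed = new Call();
        runSwallowing(swallowed, 10, 5);
        System.out.println("swallowed:  retry = " + clientShouldRetry(swallowed));

        Call propagated = new Call();
        runPropagating(propagated, 10, 5);
        System.out.println("propagated: retry = " + clientShouldRetry(propagated));
    }
}
```

In this sketch the swallowing variant leaves the client with no do-not-retry
signal, so it keeps retrying, while the propagating variant lets the client
stop after the first failure.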