ZhenyuLi created HBASE-28589:
--------------------------------
Summary: Client Does not Stop Retrying after DoNotRetryException
Key: HBASE-28589
URL: https://issues.apache.org/jira/browse/HBASE-28589
Project: HBase
Issue Type: Bug
Components: IPC/RPC
Affects Versions: 2.0.0, 1.5.0, 1.4.0, 1.3.0, 1.2.0
Reporter: ZhenyuLi
I recently discovered that the fix for HBASE-14598 does not completely resolve
the issue. That fix addressed two aspects. First, when a Scan/Get RPC would
allocate a very large array that could lead to an out-of-memory (OOM) error,
the server now checks the array size before allocation and throws an exception
directly, which prevents the region server from crashing and avoids possible
cascading failures. Second, the developer intended for the client to stop
retrying after such a failure, since retrying will not resolve the issue.
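To make the first aspect concrete, here is a minimal sketch of a size check performed before a buffer is grown. It is not the HBASE-14598 patch itself: the class name, the exception type NonRetryableSketchException, and the MAX_ARRAY_SIZE cap are all assumptions chosen for illustration.

```java
import java.io.IOException;

final class CheckedBufferGrowth {
    /** Hypothetical stand-in for a "do not retry" exception type. */
    static class NonRetryableSketchException extends IOException {
        NonRetryableSketchException(String message) {
            super(message);
        }
    }

    // Assumed cap for this sketch; many JVMs cannot allocate arrays much larger.
    static final long MAX_ARRAY_SIZE = Integer.MAX_VALUE - 8L;

    /**
     * Returns the capacity needed to append {@code extra} bytes to a buffer
     * that already holds {@code currentSize} bytes, or throws before any
     * allocation is attempted if that capacity exceeds the cap.
     */
    static int checkedCapacity(long currentSize, long extra)
            throws NonRetryableSketchException {
        // long arithmetic, so the sum cannot silently overflow an int
        long required = currentSize + extra;
        if (required > MAX_ARRAY_SIZE) {
            throw new NonRetryableSketchException(
                "Requested buffer of " + required + " bytes exceeds " + MAX_ARRAY_SIZE);
        }
        return (int) required;
    }

    public static void main(String[] args) {
        try {
            System.out.println(checkedCapacity(10, 5));
            checkedCapacity(Integer.MAX_VALUE - 10L, 100L);
        } catch (NonRetryableSketchException e) {
            System.out.println("rejected before allocation: " + e.getMessage());
        }
    }
}
```

Failing fast here keeps the server from attempting an allocation that could exhaust the heap, but it only helps the client if the exception type actually reaches it.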
However, the fix relies on throwing a DoNotRetryException. After
ByteBufferOutputStream.write throws the DoNotRetryException, it propagates up
the call stack (ByteBufferOutputStream.write --> encoder.write -->
encodeCellsTo --> this.cellBlockBuilder.buildCellBlockStream -->
call.setResponse) and is ultimately caught in CallRunner.run, which only logs
it. Consequently, the DoNotRetryException is never sent back to the client.
Instead, the client receives a generic exception for the failed RPC request
and keeps retrying, which is not the desired behavior.
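The effect on the client can be illustrated with a small, self-contained model. This is a simplified sketch and not HBase code: the retry loop, the DoNotRetrySketchException type, and the two server stand-ins are hypothetical, and the only assumption taken from the report is that the client stops retrying when it sees the non-retryable type and retries on anything else.

```java
import java.io.IOException;

final class RetryPropagationSketch {
    /** Hypothetical stand-in for the non-retryable exception type. */
    static class DoNotRetrySketchException extends IOException {
        DoNotRetrySketchException(String message) {
            super(message);
        }
    }

    /** A single RPC attempt as seen by the client. */
    interface Rpc {
        void call() throws IOException;
    }

    /** Fails the way the report describes: the response is too large to build. */
    static void buildResponse() throws DoNotRetrySketchException {
        throw new DoNotRetrySketchException("response too large");
    }

    /** Models the reported behavior: the specific exception is logged and lost. */
    static void swallowingServer() throws IOException {
        try {
            buildResponse();
        } catch (DoNotRetrySketchException e) {
            // Only a log line is produced; the client sees a generic failure.
            throw new IOException("call failed");
        }
    }

    /** Models the intended behavior: the specific exception reaches the client. */
    static void propagatingServer() throws IOException {
        buildResponse();
    }

    /** Returns how many attempts the client makes before it stops. */
    static int attemptsUntilGiveUp(Rpc rpc, int maxAttempts) {
        int attempts = 0;
        while (attempts < maxAttempts) {
            attempts++;
            try {
                rpc.call();
                return attempts; // success
            } catch (DoNotRetrySketchException e) {
                return attempts; // non-retryable: stop immediately
            } catch (IOException e) {
                // generic failure: fall through and retry
            }
        }
        return attempts;
    }

    public static void main(String[] args) {
        System.out.println(attemptsUntilGiveUp(RetryPropagationSketch::swallowingServer, 5));
        System.out.println(attemptsUntilGiveUp(RetryPropagationSketch::propagatingServer, 5));
    }
}
```

In this model the client exhausts all of its attempts against the swallowing server and stops after the first attempt against the propagating one, which is the difference the report is asking for.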
--
This message was sent by Atlassian Jira
(v8.20.10#820010)