[
https://issues.apache.org/jira/browse/HBASE-29679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18033331#comment-18033331
]
Charles Connell edited comment on HBASE-29679 at 10/27/25 9:20 PM:
-------------------------------------------------------------------
!Screenshot 2025-10-27 at 16.32.52.png|thumbnail!
This chart shows the effects of this patch on the garbage collection rate in
one of my company's HBase clusters, which serves a lot of
RpcThrottlingExceptions.
> Suppress stack trace in RpcThrottlingException
> ----------------------------------------------
>
> Key: HBASE-29679
> URL: https://issues.apache.org/jira/browse/HBASE-29679
> Project: HBase
> Issue Type: Improvement
> Components: Quotas
> Reporter: Charles Connell
> Assignee: Charles Connell
> Priority: Minor
> Labels: pull-request-available
> Attachments: Screenshot 2025-10-27 at 16.32.52.png,
> failRegionAction.alloc.html, failRegionAction.cpu.html,
> failRegionAction.wall.html
>
>
> When under heavy load, a RegionServer may need to serve a very large number
> of {{RpcThrottlingExceptions}} per second. Ideally, these should be cheap to
> send, because they are HBase's load shedding mechanism.
> At my company, we sometimes see that sending many {{RpcThrottlingExceptions}}
> is not cheap. The most expensive part is generating the exception's
> stack trace and then serializing it over the wire. The stack trace is not
> necessary, so it can be skipped to save a lot of work.
> I'm attaching a CPU-time profile, wall-clock-time profile, and allocation
> profile, showing the problem in action. In the allocation profile,
> {{StringUtils.stringifyException}} is responsible for 26% of allocations. In
> the CPU-time profile, {{StringUtils.stringifyException}} and
> {{RpcThrottlingException.<init>}} are together directly responsible for 4% of
> CPU time, and indirectly responsible for more time spent garbage collecting later.
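
For illustration, here is a minimal Java sketch of one common way to make an exception cheap to construct: overriding fillInStackTrace() so the JVM never walks the stack when the exception is created. This is an assumption about the general technique, not necessarily how the patch implements it. CheapThrottlingException and StackTraceDemo are hypothetical names, not HBase classes.

```java
import java.io.IOException;

// Hypothetical stand-in for an exception that is thrown at a very high rate.
// It is NOT HBase's RpcThrottlingException; it only illustrates the technique.
class CheapThrottlingException extends IOException {
    CheapThrottlingException(String message) {
        super(message);
    }

    // Throwable's constructor calls fillInStackTrace(). Returning `this`
    // without delegating to super skips the stack walk, so no stack trace
    // is captured and none is left to stringify or serialize later.
    @Override
    public synchronized Throwable fillInStackTrace() {
        return this;
    }
}

class StackTraceDemo {
    // Number of stack frames recorded in the given throwable.
    static int traceDepth(Throwable t) {
        return t.getStackTrace().length;
    }

    public static void main(String[] args) {
        Throwable cheap = new CheapThrottlingException("throttled");
        Throwable normal = new IOException("ordinary exception");

        System.out.println("cheap trace depth:  " + traceDepth(cheap));
        System.out.println("normal trace depth: " + traceDepth(normal));
    }
}
```

The trade-off is that such an exception no longer records where it was thrown, which is acceptable only when the message and type alone tell the caller what happened.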