[
https://issues.apache.org/jira/browse/HBASE-27704?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Duo Zhang updated HBASE-27704:
------------------------------
Component/s: Quotas
> Quotas can drastically overflow configured limit
> ------------------------------------------------
>
> Key: HBASE-27704
> URL: https://issues.apache.org/jira/browse/HBASE-27704
> Project: HBase
> Issue Type: Bug
> Components: Quotas
> Reporter: Bryan Beaudreault
> Assignee: Bryan Beaudreault
> Priority: Major
> Labels: patch-available
> Fix For: 2.6.0, 3.0.0-alpha-4, 2.5.4, 2.4.18
>
> Attachments: Screenshot 2023-03-10 at 5.17.51 PM.png
>
>
> The original implementation did not allow exceeding the quota. For example, if
> you specify a limit of 10 resources/sec and consume 20 resources, it takes 1.1
> seconds before you are able to submit another request. This was covered by the
> [testOverconsumption in
> TestRateLimiter|https://github.com/apache/hbase/blame/587b0b4f20bdc0415b6541023e611b69c87dba15/hbase-server/src/test/java/org/apache/hadoop/hbase/quotas/TestRateLimiter.java#L97].
> As an incidental part of HBASE-13686, that logic was changed. There is no
> mention of the reasoning behind the change in the issue comments or review
> board, so I think it was missed. The goal of that issue was to add different
> refill strategies, but it also modified the overconsumption behavior. The
> testOverconsumption was [split out for both refill
> strategies|https://github.com/apache/hbase/blame/master/hbase-server/src/test/java/org/apache/hadoop/hbase/quotas/TestRateLimiter.java#L104-L159],
> but the core reasoning was lost. The comment says:
> {code:java}
> // 10 resources are available, but we need to consume 20 resources
> // Verify that we have to wait at least 1.1sec to have 1 resource available
> {code}
> But the actual test was updated to only require a new resource after 100ms.
> This is incorrect.
> The problem is that when consuming, if the available amount goes negative, it is set to 0
> [here|https://github.com/apache/hbase/blame/master/hbase-server/src/main/java/org/apache/hadoop/hbase/quotas/RateLimiter.java#L187-L191].
> Additionally, when refilling, the new logic does a Math.max(0, available +
> refillAmount):
> [here|https://github.com/apache/hbase/blame/master/hbase-server/src/main/java/org/apache/hadoop/hbase/quotas/RateLimiter.java#L159-L163].
> So the available amount can never go below 0, which means overconsumption is
> never paid back. That is impractical for a rate limiter.
> With this setup it's very easy to drastically overconsume the rate limiter.
> See the attached screenshot, which shows two humps. The first hump uses the
> current logic; the second uses my fix, which removes both of those problems.
> The rate limit was set to 500 mb/s, but I was easily able to go over 700 mb/s
> without the fix.
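
The arithmetic in the description (10 resources/sec, consume 20, then wait for 1 more) can be sketched as follows. This is a minimal illustration, not the HBase RateLimiter code: the class QuotaSketch, its fields, and the clampAtZero flag are hypothetical names, and a continuous (proportional) refill is assumed.

```java
/** Hypothetical sketch of the overconsumption arithmetic; not the HBase RateLimiter. */
final class QuotaSketch {
    private final long limit;          // resources refilled per interval
    private final long intervalMs;     // refill interval in milliseconds
    private final boolean clampAtZero; // true mimics the clamping described in the issue
    private long available;            // may go negative when clampAtZero is false

    QuotaSketch(long limit, long intervalMs, boolean clampAtZero) {
        this.limit = limit;
        this.intervalMs = intervalMs;
        this.clampAtZero = clampAtZero;
        this.available = limit; // start with a full bucket
    }

    /** Consume unconditionally; the debt is kept only if clamping is disabled. */
    void consume(long amount) {
        available -= amount;
        if (clampAtZero && available < 0) {
            available = 0; // the overconsumed amount is forgotten here
        }
    }

    /** Milliseconds until `amount` resources are available, assuming continuous refill. */
    long waitIntervalMs(long amount) {
        long deficit = amount - available;
        if (deficit <= 0) {
            return 0;
        }
        // Ceiling of deficit * intervalMs / limit, using integer arithmetic.
        return (deficit * intervalMs + limit - 1) / limit;
    }

    public static void main(String[] args) {
        QuotaSketch strict = new QuotaSketch(10, 1000, false);
        strict.consume(20);
        System.out.println(strict.waitIntervalMs(1));  // 1100

        QuotaSketch clamped = new QuotaSketch(10, 1000, true);
        clamped.consume(20);
        System.out.println(clamped.waitIntervalMs(1)); // 100
    }
}
```

With the debt preserved, the caller waits 1100 ms for the next resource, which matches the comment in the test. With clamping, the wait drops to 100 ms, so the extra 10 resources are never accounted for.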
--
This message was sent by Atlassian Jira
(v8.20.10#820010)