[ 
https://issues.apache.org/jira/browse/HBASE-27704?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Duo Zhang updated HBASE-27704:
------------------------------
    Component/s: Quotas

> Quotas can drastically overflow configured limit
> ------------------------------------------------
>
>                 Key: HBASE-27704
>                 URL: https://issues.apache.org/jira/browse/HBASE-27704
>             Project: HBase
>          Issue Type: Bug
>          Components: Quotas
>            Reporter: Bryan Beaudreault
>            Assignee: Bryan Beaudreault
>            Priority: Major
>              Labels: patch-available
>             Fix For: 2.6.0, 3.0.0-alpha-4, 2.5.4, 2.4.18
>
>         Attachments: Screenshot 2023-03-10 at 5.17.51 PM.png
>
>
> The original implementation did not allow exceeding the quota. For example, if 
> you specify a limit of 10 resources/sec and consume 20 resources, it takes 1.1 
> seconds before you are able to submit another request. This was covered by the 
> [testOverconsumption in 
> TestRateLimiter|https://github.com/apache/hbase/blame/587b0b4f20bdc0415b6541023e611b69c87dba15/hbase-server/src/test/java/org/apache/hadoop/hbase/quotas/TestRateLimiter.java#L97].
>  As an incidental part of HBASE-13686, that logic was changed. There is no 
> mention of the reasoning behind the change in the issue comments or review 
> board; I think it was missed. The goal of that issue was to add different 
> refill strategies, but it also modified the overconsumption handling. The 
> testOverconsumption was [split out for both refill 
> strategies|https://github.com/apache/hbase/blame/master/hbase-server/src/test/java/org/apache/hadoop/hbase/quotas/TestRateLimiter.java#L104-L159],
>  but the core reasoning was lost. The comment says:
> {code:java}
> // 10 resources are available, but we need to consume 20 resources
> // Verify that we have to wait at least 1.1sec to have 1 resource available 
> {code}
> But the actual test was updated to only require a new resource after 100ms. 
> This is incorrect. 
> The problem is that when consuming takes the available amount negative, it is reset to 0 
> [here|https://github.com/apache/hbase/blame/master/hbase-server/src/main/java/org/apache/hadoop/hbase/quotas/RateLimiter.java#L187-L191].
>  Additionally, when refilling, the new logic does a Math.max(0, available + 
> refillAmount): 
> [here|https://github.com/apache/hbase/blame/master/hbase-server/src/main/java/org/apache/hadoop/hbase/quotas/RateLimiter.java#L159-L163].
>  So it's effectively impossible for the available amount to go below 0, which 
> is impractical for a rate limiter. 
> With this setup it's very easy to drastically overconsume the rate limiter. 
> See the attached screenshot, which shows two humps. The first hump has the 
> current logic; the second has my fix, which removes both of those problems. 
> The rate limit was set to 500 mb/s, but I was easily able to go over 700 mb/s 
> without the fix.
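
The difference between the two behaviors described above can be illustrated
with a minimal sketch. This is not the HBase RateLimiter class: the class name
SketchRateLimiter, its methods, and the clampAtZero flag are all hypothetical,
and time is passed in explicitly rather than read from a clock. With
clampAtZero set to true the sketch mimics the clamping in consume and refill
that the description identifies as the problem; with false it lets the
available amount go negative.

```java
/**
 * Hypothetical sketch of a fixed-rate limiter, NOT the HBase RateLimiter.
 * All names here are assumptions made for illustration only.
 */
final class SketchRateLimiter {
    private final long limit;          // resources granted per interval
    private final long intervalMs;     // interval length in milliseconds
    private final boolean clampAtZero; // true mimics the clamping described above
    private long available;            // may go negative when clampAtZero is false

    SketchRateLimiter(long limit, long intervalMs, boolean clampAtZero) {
        this.limit = limit;
        this.intervalMs = intervalMs;
        this.clampAtZero = clampAtZero;
        this.available = limit;
    }

    /** Consumes unconditionally, even if more than the available amount. */
    void consume(long amount) {
        available -= amount;
        if (clampAtZero && available < 0) {
            available = 0; // the overconsumed amount is forgotten here
        }
    }

    /** Credits resources in proportion to elapsed time, capped at the limit. */
    void refill(long elapsedMs) {
        long refillAmount = limit * elapsedMs / intervalMs;
        long next = Math.min(limit, available + refillAmount);
        available = clampAtZero ? Math.max(0, next) : next;
    }

    /** Milliseconds to wait until at least `amount` resources are available. */
    long waitIntervalMs(long amount) {
        long deficit = amount - available;
        if (deficit <= 0) {
            return 0;
        }
        // Ceiling division: time needed to refill the whole deficit.
        return (deficit * intervalMs + limit - 1) / limit;
    }

    long getAvailable() {
        return available;
    }
}
```

Under these assumptions, with a limit of 10 per 1000 ms, calling consume(20)
and then waitIntervalMs(1) yields 1100 when clampAtZero is false (the 1.1
seconds from the original test comment) and 100 when clampAtZero is true (the
100ms the updated test requires).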



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
