Bryan Beaudreault created HBASE-27704:
-----------------------------------------

             Summary: Quotas can drastically overflow configured limit
                 Key: HBASE-27704
                 URL: https://issues.apache.org/jira/browse/HBASE-27704
             Project: HBase
          Issue Type: Bug
            Reporter: Bryan Beaudreault
         Attachments: Screenshot 2023-03-10 at 5.17.51 PM.png

The original implementation did not allow exceeding quota. For example, if you 
specify a limit of 10 resources/sec and submit 20 resources, it takes 1.1 
seconds to be able to submit another request. This was covered by the 
[testOverconsumption in 
TestRateLimiter|https://github.com/apache/hbase/blame/587b0b4f20bdc0415b6541023e611b69c87dba15/hbase-server/src/test/java/org/apache/hadoop/hbase/quotas/TestRateLimiter.java#L97].
 As an incidental part of HBASE-13686, that logic was changed. There is no 
mention of the reasoning behind the change in the issue comments or review 
board, so I think it was missed. The goal of that issue was to add different 
refill strategies, but it also modified the overconsumption behavior. The 
testOverconsumption was [split out for both refill 
strategies|https://github.com/apache/hbase/blame/master/hbase-server/src/test/java/org/apache/hadoop/hbase/quotas/TestRateLimiter.java#L104-L159],
 but the core reasoning was lost. The comment says:
{code:java}
// 10 resources are available, but we need to consume 20 resources
// Verify that we have to wait at least 1.1sec to have 1 resource available 
{code}
But the actual test was updated to require only a 100ms wait before a new 
resource is available. This is incorrect. 
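
The 1.1 second and 100ms figures above can be reproduced with a little arithmetic. The following is only a sketch, not code from HBase: the class OverconsumptionWait and its waitMillis method are hypothetical names, and it assumes a steady linear refill at the configured rate.

```java
// Illustrative only: a hypothetical helper, not code from HBase's RateLimiter.
final class OverconsumptionWait {
  private OverconsumptionWait() {}

  /**
   * Returns the milliseconds until at least 1 resource is available after a
   * request, assuming a steady linear refill of limitPerSec resources/sec.
   */
  static long waitMillis(long limitPerSec, long available, long requested,
      boolean clampAtZero) {
    long balance = available - requested;   // e.g. 10 - 20 = -10
    if (clampAtZero) {
      balance = Math.max(0, balance);       // the debt is forgotten
    }
    if (balance >= 1) {
      return 0;                             // nothing to wait for
    }
    long needed = 1 - balance;              // resources to refill to reach 1
    // Ceiling division: time needed to refill `needed` resources.
    return (needed * 1000 + limitPerSec - 1) / limitPerSec;
  }

  public static void main(String[] args) {
    System.out.println(waitMillis(10, 10, 20, false)); // 1100
    System.out.println(waitMillis(10, 10, 20, true));  // 100
  }
}
```

With a limit of 10 resources/sec, consuming 20 of 10 available leaves a balance of -10, so 11 resources must be refilled before 1 is available: 1.1 seconds. If the balance is clamped at 0, only 1 resource must be refilled: 100ms.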

The problem is that when consuming, if the available amount would go negative, it is set to 0 
[here|https://github.com/apache/hbase/blame/master/hbase-server/src/main/java/org/apache/hadoop/hbase/quotas/RateLimiter.java#L187-L191].
 Additionally, when refilling, the new logic does a Math.max(0, refillAmount): 
[here|https://github.com/apache/hbase/blame/master/hbase-server/src/main/java/org/apache/hadoop/hbase/quotas/RateLimiter.java#L159-L163].
 So it's impossible for the available amount to go below 0, which is 
impractical for a rate limiter because overconsumption is never paid back. 
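
To illustrate why clamping at 0 matters, here is a deliberately simplified model, not the HBase implementation. The class ClampDemo, the 100ms tick, and the rule that a request is admitted whenever at least 1 resource is available are all assumptions made for the sketch.

```java
// Illustrative model only; class and method names are hypothetical, not HBase APIs.
final class ClampDemo {
  private ClampDemo() {}

  /**
   * Simulates `ticks` ticks of 100 ms each. On every tick the bucket is
   * refilled, and a request costing `cost` is admitted if at least 1 resource
   * is available. Returns the total amount consumed.
   */
  static long totalConsumed(long limitPerSec, long cost, int ticks,
      boolean clampAtZero) {
    long perTick = limitPerSec / 10;  // refill per tick (assumes limit divisible by 10)
    long avail = limitPerSec;         // start with a full bucket
    long consumed = 0;
    for (int t = 0; t < ticks; t++) {
      if (t > 0) {
        avail = Math.min(limitPerSec, avail + perTick); // refill, capped at the limit
      }
      if (avail >= 1) {
        consumed += cost;
        avail -= cost;                // may go negative
        if (clampAtZero) {
          avail = Math.max(0, avail); // the overconsumption is forgotten
        }
      }
    }
    return consumed;
  }

  public static void main(String[] args) {
    // 10 resources/sec, requests of 20 resources, 10 simulated seconds.
    System.out.println(totalConsumed(10, 20, 100, true));  // 2000
    System.out.println(totalConsumed(10, 20, 100, false)); // 120
  }
}
```

In this model, with a limit of 10 resources/sec and requests of 20 resources, the clamped variant admits a request on every tick and consumes 2000 resources in 10 seconds. The variant that lets the balance go negative consumes 120, which is close to the configured limit plus the initial burst.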

With this setup it's very easy to drastically exceed the configured limit. See 
the attached screenshot, which shows two humps. The first one has the current 
logic; the second has my fix, which removes both of those problems. The 
rate limit was set to 500 MB/s, but I was easily able to go over 700 MB/s 
without the fix.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
