[GitHub] [kafka] gharris1727 commented on pull request #12806: KAFKA-14345: Fix flakiness with more accurate bound in (Dynamic)ConnectionQuotaTest

via GitHub Mon, 07 Aug 2023 12:13:19 -0700


gharris1727 commented on PR #12806:
URL: https://github.com/apache/kafka/pull/12806#issuecomment-1668440375


   Hey @divijvaidya .
   
   > aren't we then accepting the current behaviour of quotas implementation as 
expected behaviour i.e. the deviation could be +- epsilon and epsilon could 
exceed the thresholds (max.connection.creation.rate) set for a windowing period.
   
   It is not possible to eliminate the deviation visible to the outside 
observer, and these tests will always need to include an error term. Here's 
what the error bounds will be for the default window configuration and various 
observation times with the variable-width and fixed-width windowing algorithms:
   ```
   millis  variable-width          fixed-width
   11001        2.4444444444444446      2.2
   20001        1.736842105263158       2.2
   22001        1.736842105263158       1.5714285714285714
   30001        1.5172413793103448      1.5714285714285714
   33001        1.5172413793103448      1.375
   40001        1.4102564102564104      1.375
   44001        1.4102564102564104      1.2790697674418605
   50001        1.346938775510204       1.2790697674418605
   55001        1.346938775510204       1.2222222222222223
   60001        1.305084745762712       1.2222222222222223
   66001        1.305084745762712       1.1846153846153846
   70001        1.2753623188405796      1.1846153846153846
   77001        1.2753623188405796      1.1578947368421053
   80001        1.2531645569620253      1.1578947368421053
   88001        1.2531645569620253      1.1379310344827587
   90001        1.2359550561797752      1.1379310344827587
   99001        1.2359550561797752      1.1224489795918366
   ```
   
   This PR implements the variable-width limits because the variable-width 
algorithm is currently on trunk, not because it is more accepted or correct. 
I'm personally ambivalent about which algorithm should be used. This PR 
replaces the un-motivated and incorrect hardcoded constants with computed 
bounds, and that is beneficial with or without the fixed-width algorithm.
   
   > Alternatively, the fix i.e. https://github.com/apache/kafka/pull/12045 
doesn't require a KIP (it doesn't change any public interfaces). Please feel 
free to pick up that PR (or duplicate it). I won't have time any time soon to 
work on it. I will be happy to provide a review.
   
   If you don't have time or interest to work that PR and/or KIP, then I think 
a test-only fix is appropriate until someone is interested in picking up the 
PR.  The flaky tests have already made us aware of the odd behavior of the 
variable-width algorithm, so more ongoing failures aren't helpful.
   
   > Having said that, the answer might be that the current behaviour is 
accepted behaviour. In which case, I would be comfortable with this change if 
it is accompanied by a change in docs explaining the current expectations from 
the windowing algorithm so that the users at least know what their expectation 
from the quota implementation should be.
   
   I don't think that what this PR addresses needs to be explained in the 
documentation.
   
   * The existing quotas documentation is very general, and focuses on the 
strategic usage of quotas. This extremely specific error bound would be out of 
place among the other documentation.
   * How users choose their rate limit depends on so many more important 
factors than this error term (such as network environment, hardware, etc) that 
I think to mention the error factor in specific would mislead users to think it 
is more important than it is.
   * Users will already be compensating for this error term when setting their 
limits. If someone observes that the effective rate limit (either for bursts or 
steady-state flows) is too high and affects their cluster health, they will 
lower their configured rate limit to compensate. They may be blaming the wrong 
thing (their hardware vs the rate limit algorithm) but the effect is still the 
same.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[GitHub] [kafka] gharris1727 commented on pull request #12806: KAFKA-14345: Fix flakiness with more accurate bound in (Dynamic)ConnectionQuotaTest

Reply via email to