lhotari commented on PR #25061:
URL: https://github.com/apache/pulsar/pull/25061#issuecomment-3640705498
> @lhotari I believe this issue exists in all versions. The AsyncTokenBucket implementation itself is correct, but the dispatcher was not using the limiter correctly.

Which version did you observe the behavior in? After PIP-322 there were a few issues, and the last one was fixed in #24012. Please validate that the imbalance problem really exists in practice. The rate limiter smooths the traffic, and possible imbalance is not noticeable in most cases. Is your experience different?

In Pulsar there isn't really a scheduler that would give every dispatcher a "fair share" of a shared rate limiter, so it's possible that this causes imbalance. However, that problem wouldn't be solved by the solution in this PR in any case.

> > One possible improvement to the existing solution could be to acquire tokens based on some estimate before reading the entries and adjust the tokens later, after sending (like it's currently performed).
>
> This approach is workable but not ideal. It still relies on estimating tokens before dispatching, which can lead to inaccurate enforcement when multiple dispatchers share the same limiter. Instead, we can iterate through the entries after they are read and check the rate limit per entry, consuming tokens before dispatching each message. This avoids relying on estimates and prevents overshoot under concurrency.

It's part of the design of rate limiters that they can temporarily go over the limit. The AsyncTokenBucket implementation guarantees that the average rate is limited to the desired rate. Dropping entries just before sending them to a consumer is not a great idea for shared subscriptions, and it's not a great idea for other subscription types either, since the entries have already been read from BookKeeper unnecessarily.

The usual reason to use rate limiters is capacity management. If the rate limiter itself adds performance overhead to the system, that has negative effects. Therefore, I think that consuming tokens based on an estimate and making an adjustment afterwards would be the correct path forward for improving accuracy in "thundering herd" style load increases, which is where a solution to the described problem could actually matter (see the first sketch below).

If there's a need to improve accuracy, the "fair sharing" problem should also be addressed in the same solution, since currently nothing ensures that each dispatcher gets roughly the same share of the total limit in a shared limiter. This problem has been solved for publish rate limiters with a queue (also sketched below).
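To make the estimate-and-adjust idea concrete, here's a minimal sketch of a dispatcher consuming tokens up front and correcting afterwards. The `TokenBucket` interface, `EstimatingDispatcher`, and all other names here are hypothetical illustrations, not Pulsar's actual AsyncTokenBucket API:

```java
// Hypothetical sketch of the estimate-and-adjust idea; the interface and all
// names are illustrative, not Pulsar's actual AsyncTokenBucket API.
interface TokenBucket {
    /** Consumes tokens; a negative amount credits tokens back (sketch semantics). */
    void consumeTokens(long amount);

    /** Returns true if the bucket currently has a positive token balance. */
    boolean containsTokens();
}

class EstimatingDispatcher {
    private final TokenBucket bucket;
    private long avgBytesPerEntry = 1024; // running estimate of entry size

    EstimatingDispatcher(TokenBucket bucket) {
        this.bucket = bucket;
    }

    void readAndDispatch(int entriesToRead) {
        if (!bucket.containsTokens()) {
            // Back off and reschedule; nothing is read from BookKeeper
            // while the bucket is exhausted.
            return;
        }
        // 1. Consume tokens up front, based on the current size estimate.
        long estimated = (long) entriesToRead * avgBytesPerEntry;
        bucket.consumeTokens(estimated);

        // 2. Read and send the entries, measuring their actual total size.
        long actualBytes = readEntriesAndSend(entriesToRead);

        // 3. Adjust afterwards: consume the shortfall, or credit back the
        //    over-estimate (negative amount).
        bucket.consumeTokens(actualBytes - estimated);

        // 4. Refine the estimate for the next read with a simple EWMA.
        avgBytesPerEntry = (avgBytesPerEntry * 7 + actualBytes / entriesToRead) / 8;
    }

    private long readEntriesAndSend(int entriesToRead) {
        // Placeholder for the real read-from-BookKeeper-and-dispatch path.
        return (long) entriesToRead * 1000;
    }
}
```

The key property is the adjustment in step 3: even when individual estimates are off, the correction keeps the long-term average rate accurate without ever dropping entries that were already read.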

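And to illustrate the queue idea from the last paragraph, here's a rough sketch of FIFO fair sharing over a shared limiter. Again, the names and structure are a deliberate simplification for discussion, not how the publish rate limiter is actually implemented:

```java
// Purely illustrative: blocked dispatchers wait in a FIFO queue so that each
// gets served in arrival order, similar in spirit to the queue used for
// publish rate limiters. Not Pulsar's actual implementation.
import java.util.ArrayDeque;
import java.util.Queue;

class FairSharedLimiter {
    private record Waiter(long amount, Runnable task) {}

    private final Queue<Waiter> waiters = new ArrayDeque<>();
    private long tokens;

    FairSharedLimiter(long initialTokens) {
        this.tokens = initialTokens;
    }

    /**
     * Runs the task immediately if tokens are available and no one is
     * already waiting; otherwise queues it so earlier arrivals go first.
     */
    synchronized void acquire(long amount, Runnable task) {
        if (waiters.isEmpty() && tokens >= amount) {
            tokens -= amount;
            task.run();
        } else {
            waiters.add(new Waiter(amount, task));
        }
    }

    /** Called by a periodic refill timer; serves waiters in arrival order. */
    synchronized void refill(long amount) {
        tokens += amount;
        while (!waiters.isEmpty() && tokens >= waiters.peek().amount()) {
            Waiter next = waiters.poll();
            tokens -= next.amount();
            // In real code the task would be handed to an executor rather
            // than run while holding the lock.
            next.task().run();
        }
    }
}
```

The point of the queue is that a dispatcher that failed to get tokens is not simply retried on its own schedule (where an aggressive dispatcher could repeatedly win); it is granted its share in arrival order when tokens become available.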