lhotari commented on PR #25061: URL: https://github.com/apache/pulsar/pull/25061#issuecomment-3642815038
> I'm using a forked 3.0.x version, but this behavior is not tied to any specific branch. The issue is not related to the limiter algorithm itself. The problem occurs in the dispatcher, which reads entries from BookKeeper and sends them to consumers.
>
> When multiple topics share a global dispatch limiter, such as `brokerDispatchRateLimiter`, each dispatcher can observe the same available token value concurrently and proceed in parallel. This causes deterministic overshoot because multiple dispatchers effectively consume the same tokens. This differs from the normal temporary smoothing overshoot allowed by AsyncTokenBucket. The root cause is simply how dispatchers interact with the shared limiter.

Yes, this is a weakness in the current solution. However, it would be useful to share what the actual impact of temporarily going over the limit is. It seems that also in 3.0.x the dispatch rate limiter implementation smooths out the acquired tokens over time, so the behavior is similar to the behavior with the PIP-322 changes.

> Regarding dropping entries before sending to consumers, if entries are not dropped, they will still be delivered to the client, which would effectively bypass the limiter. In that case, the limiter becomes meaningless.
>
> I understand that rate limiters are primarily for capacity management and that enforcing them may introduce some performance overhead. However, since a limiter is explicitly configured, the broker should enforce the dispatch rate, even if this temporarily blocks or delays some dispatches. My approach is similar in spirit to the suggested pre-estimate and adjust method, but instead of estimating, it consumes tokens before dispatch. This ensures that concurrent dispatchers cannot overshoot the configured limit while preserving the intended behavior of the rate limiter.

The optimal solution would be to acquire tokens based on an estimate of what the consumers would be able to read, and then do an adjustment after sending the entries (a rough sketch of this idea follows below). In the adjustment phase, it would be possible to consume additional tokens or return unused ones. This would address the concerns I have about the performance impact of reading entries and then dropping them before sending them out to consumers. Just wondering if you are interested in adjusting the solution in this direction?
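
To make the estimate-and-adjust idea a bit more concrete, here's a rough sketch of the flow I have in mind. The `DispatchLimiter` interface and the `beforeRead`/`afterSend` names are made up for illustration only; they are not the existing dispatcher or AsyncTokenBucket API:

```java
// Hypothetical sketch of the estimate-and-adjust flow; DispatchLimiter and the
// method names below are placeholders, not the existing dispatcher/AsyncTokenBucket API.
interface DispatchLimiter {
    void consumeTokens(long tokens);  // spend tokens from the shared bucket
    void returnTokens(long tokens);   // give back tokens that were not needed
}

final class EstimateAndAdjustDispatcher {
    private final DispatchLimiter limiter;

    EstimateAndAdjustDispatcher(DispatchLimiter limiter) {
        this.limiter = limiter;
    }

    // Before reading from BookKeeper: reserve tokens for the amount the consumers
    // are estimated to be able to read, so concurrent dispatchers sharing the
    // limiter cannot all spend the same available balance.
    long beforeRead(long estimatedSize) {
        limiter.consumeTokens(estimatedSize);
        return estimatedSize;
    }

    // After sending the entries: settle the difference between the reservation and
    // what was actually dispatched, either consuming more tokens or returning the
    // surplus. No entries are dropped after the read, so the read work is not wasted.
    void afterSend(long reservedTokens, long actuallySentSize) {
        long delta = actuallySentSize - reservedTokens;
        if (delta > 0) {
            limiter.consumeTokens(delta);
        } else if (delta < 0) {
            limiter.returnTokens(-delta);
        }
    }
}
```

The key point is that the reservation happens before the read, so the shared bucket reflects in-flight dispatches, while the adjustment phase keeps the accounting accurate without dropping entries that have already been read from BookKeeper.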
