Hi all,

Rate limiting was introduced to Polaris in September 2024 and has seen
several updates since. Recently, a bug was identified regarding how we
handle certain logic paths, and a fix has been proposed in a new PR.

In reviewing the recent PR and the underlying implementation, I’ve
been thinking about a few architectural questions:

1. Correctness and Determinism:
Does the current logic fully capture the deterministic properties
required by the Token Bucket algorithm? For instance, by not
accounting for fractional token regeneration, are we at risk of
drifting from expected behavior?

2. Testing Strategy:
Would it be beneficial to introduce deterministic clock simulations or
property-based verification to our test suite? I’m concerned that our
current reliance on the wall clock might make it difficult to catch
edge cases in a CI environment.

3. Clock Sources:
Since the wall clock can move backward and forward, should we consider
moving toward a strictly monotonic clock to avoid arithmetic overflows
or "infinite token" scenarios?

4. Leveraging Existing Libraries:
Given the complexity of getting concurrency and synchronization right
for high-throughput rate limiting, would it make sense to explore
using Guava’s RateLimiter? Since it's already on our classpath, could
we benefit from its maturity, optimized synchronization and
sub-millisecond precision?

I'm also curious to hear your thoughts on whether we should address
these additional areas:
* Cold Start: Should we be supporting "warm-up" periods for rate limiters?
* Contention: Could we reduce synchronization overhead by leveraging a
more mature implementation's critical paths?

Rather than continuing to patch the custom implementation, what do you
all think about pausing the development on TokenBucket and evaluating
a migration to a more established library like Guava?

Robert

Reply via email to