jizhuozhi commented on issue #12748:
URL: https://github.com/apache/apisix/issues/12748#issuecomment-3525643470

   Thank you for the detailed feedback — totally agree that perfect global 
consistency can’t be achieved without a shared counter or coordination layer.
   
   The proposal here is **not to replace Redis-based global throttling**, but 
rather to introduce a **best-effort adaptive mode** for users who:
   
   * run small or medium clusters where approximate balancing is acceptable;
   * prefer to avoid Redis due to latency, cost, or operational complexity;
   * currently over-limit because local mode doesn’t account for cluster size 
at all.
   
   A few clarifications:
   
   1. **About sticky traffic**
      Yes — when requests for a given key are pinned to a single node (e.g., by client keep-alive or session affinity), dividing the rate by N will under-deliver for that key. For example, with a global limit of 100 req/s across 4 nodes, a key pinned to one node would get at most 25 req/s instead of 100.
      However, in practice, many APISIX deployments use short-lived connections or spread traffic evenly enough at layer 4/7 that such under-delivery is minor compared to the current *over-limit* behavior.
      In this mode, the goal is **predictable fairness across cluster scale**, not strict per-key fairness.
   
   2. **About /data_plane/server_info softness**
      True — that data is soft-state with TTL and lag.
      But adaptive-local mode only needs an *approximate cluster size*, accurate to within a few seconds, which is sufficient for periodic limit rebalancing.
      Even if the count is stale, the limiter still fails safely: in the worst case it temporarily over- or under-limits by the ratio of the true node count to the observed one (e.g., a stale count of 3 against 4 live nodes allows at most 4/3 of the target rate cluster-wide), which is acceptable for a “local + adaptive” tradeoff mode.
   
   3. **Positioning**
      This would be an *intermediate option* between:

      * `local` (fast, no coordination, but unscaled), and
      * `redis` (accurate, but external).

      It improves the scaling behavior of local mode without introducing external dependencies, giving operators more flexibility (see the sketch after this list).
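   To make the adaptive mode concrete, here is a minimal sketch of the rebalancing idea. This is illustrative Python rather than APISIX's actual Lua plugin code, and the names (`AdaptiveLocalLimiter`, `fetch_alive_node_count`) and the 5-second refresh interval are assumptions; in practice the node count would come from the `/data_plane/server_info` soft-state discussed in point 2.

   ```python
   import threading
   import time

   class AdaptiveLocalLimiter:
       """Sketch of an adaptive-local token bucket: each node enforces
       global_rate / observed_cluster_size, refreshed periodically from
       soft-state node-liveness data. Illustrative only, not APISIX code."""

       def __init__(self, global_rate, fetch_alive_node_count, refresh_interval=5.0):
           self.global_rate = float(global_rate)                 # cluster-wide target, req/s
           self.fetch_alive_node_count = fetch_alive_node_count  # hypothetical soft-state lookup
           self.refresh_interval = refresh_interval              # rebalance period, seconds
           self.local_rate = self.global_rate                    # until first refresh: plain local mode
           self.tokens = self.local_rate
           self.last_refill = time.monotonic()
           self.lock = threading.Lock()
           threading.Thread(target=self._rebalance_loop, daemon=True).start()

       def _rebalance_loop(self):
           while True:
               try:
                   # A stale or zero count is clamped to 1; worst case we
                   # briefly over- or under-limit, as noted in point 2.
                   n = max(1, self.fetch_alive_node_count())
               except Exception:
                   n = 1  # lookup failure: degrade to plain local mode
               with self.lock:
                   self.local_rate = self.global_rate / n
               time.sleep(self.refresh_interval)

       def allow(self):
           """Admit one request if the local share of the budget permits."""
           now = time.monotonic()
           with self.lock:
               elapsed = now - self.last_refill
               # Refill proportionally to elapsed time, capped at one second's quota.
               self.tokens = min(self.local_rate, self.tokens + elapsed * self.local_rate)
               self.last_refill = now
               if self.tokens >= 1.0:
                   self.tokens -= 1.0
                   return True
               return False
   ```

   With a global limit of 100 req/s and 4 live nodes, each node converges to 25 req/s; if the soft-state briefly reports 3 nodes, each node enforces ~33 req/s until the next refresh, i.e. the bounded, temporary over-limit described above.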
   
   In short, this proposal doesn’t aim for strong global consistency, but for a 
**lightweight, dependency-free approximation** that improves practical scaling 
while keeping latency low.

