janiussyafiq commented on PR #13578:
URL: https://github.com/apache/apisix/pull/13578#issuecomment-4806353252

   @membphis 
   Addressed all the previous concerns. The three cache-key reports were one 
root cause: the key was computed in `access` from the client request, but the 
response is determined by the *effective* upstream request `ai-proxy` builds 
later in `before_proxy` (after `ai-cache` runs), so server-side 
`options`/`override` never reached the key.
   
   The fingerprint now identifies that effective request. Since `final_body = 
f(client_body, protocol, instance{provider, options, override})` is 
deterministic, it hashes those inputs:
   
   - **client:** protocol, messages, params
   - **effective:** provider, effective model (`options.model or body.model`), 
`options`, `override.llm_options`, `override.request_body` (+ 
`request_body_force_override`), `override.endpoint`
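
To make the list above concrete, here is a minimal Python sketch of such a fingerprint. It is not the PR's code, which is Lua. The `fingerprint` helper, the dict layout, the canonical-JSON serialization, and the choice of SHA-256 are all assumptions made for illustration.

```python
import hashlib
import json


def fingerprint(client: dict, instance: dict) -> str:
    """Hash the inputs that determine the effective upstream request.

    Hypothetical layout: `client` carries `protocol` and `body`;
    `instance` carries `provider`, `options`, and `override`.
    """
    body = client.get("body") or {}
    options = instance.get("options") or {}
    override = instance.get("override") or {}
    material = {
        "client": {
            "protocol": client.get("protocol"),
            "messages": body.get("messages"),
            # Everything else in the client body counts as request params.
            "params": {k: v for k, v in body.items()
                       if k not in ("messages", "model")},
        },
        "effective": {
            "provider": instance.get("provider"),
            # Server-side model wins over the client-supplied one.
            "model": options.get("model") or body.get("model"),
            "options": options,
            "llm_options": override.get("llm_options"),
            "request_body": override.get("request_body"),
            "request_body_force_override":
                override.get("request_body_force_override"),
            "endpoint": override.get("endpoint"),
        },
    }
    # Sorted keys and fixed separators make the serialization canonical,
    # so equal inputs hash equally regardless of key order.
    canonical = json.dumps(material, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

In this sketch, two instances that differ only in `options.model` produce different fingerprints for the same client request, which is the property the earlier cache-key reports were missing.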
   
   `scope()` is reduced to pure isolation (route / consumer / `include_vars`). 
One rule closes all three cases: the plain-`ai-proxy` effective model, the 
`options`/`override` params, and the earlier instance/provider collision.
   
   Keying off the final upstream body directly isn't feasible: `before_proxy` 
is where that body is built *and* where the upstream call happens, so a 
read-through cache that short-circuits in `access` can't see it. Hashing the 
builder's inputs is equivalent, because the builder is deterministic.
   
   While reworking this I also closed a related case: on `ai-proxy-multi` 
failover, the fallback instance's response could be written under the 
originally-picked instance's key. The log-phase write is now skipped when the 
serving instance differs from the one the key was computed for. Let me know 
your view on this.
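
The failover guard can be sketched as follows, again in Python rather than the plugin's Lua. The names `keyed_instance`, `serving_instance`, `cache_key`, and both helper functions are hypothetical, and the plain dict stands in for whatever cache backend is used.

```python
from typing import Optional


def should_write_cache(keyed_instance: Optional[str],
                       serving_instance: Optional[str]) -> bool:
    """Allow the write only if the instance that served the response
    is the one the cache key was computed for."""
    return keyed_instance is not None and keyed_instance == serving_instance


def log_phase_write(cache: dict, ctx: dict, response: str) -> bool:
    """Store `response` under the precomputed key unless failover
    switched instances. Returns True if the write happened."""
    if not should_write_cache(ctx.get("keyed_instance"),
                              ctx.get("serving_instance")):
        return False  # fallback response; skip to avoid a mismatched entry
    cache[ctx["cache_key"]] = response
    return True
```

Skipping the write costs one cache miss on the next identical request, but it keeps a fallback instance's response from ever being served under the original instance's key.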


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
