membphis commented on PR #13578:
URL: https://github.com/apache/apisix/pull/13578#issuecomment-4805713517

   I think this should be addressed before merge: the cache key is currently 
computed before `ai-proxy` builds the final upstream request, so it does not 
cover all server-side mutations that can affect the LLM response.
   
   `ai-cache` computes `ctx.ai_cache_fingerprint` in `access` from the client 
request body and then scopes the key mostly by 
instance/model/route/consumer/vars. However, `ai-proxy` later mutates the 
request that is actually sent upstream through `options`, 
`override.llm_options`, and `override.request_body`. Those fields can change 
response-determining parameters such as `temperature`, `top_p`, `max_tokens`, 
tools, provider-specific body fields, etc.
   
   A concrete failure case:
   
   1. Two routes enable `ai-cache` with `cache_key.share_across_routes = true`.
   2. The client sends the same prompt and model to both routes.
   3. Route A configures `ai-proxy.options.temperature = 0.2`.
   4. Route B configures `ai-proxy.options.temperature = 0.8`, or uses 
`override.request_body` / `override.llm_options` to change the final upstream 
request.
   5. Route A warms the cache first.
   6. Route B can hit Route A's cached response even though the actual upstream 
LLM request should be different.
   
   This breaks the exact-cache contract because the key no longer represents 
the final request that produced the cached response. It can return a response 
generated under a different route-side AI configuration.
   
   Suggested fix: base the cache key on the final canonical upstream request 
body after `ai-proxy` has applied provider conversion, `options`, 
`override.llm_options`, and `override.request_body`; or include a canonicalized 
representation of all response-determining AI instance configuration in the 
key. Please also add regression tests covering shared-cache routes with 
identical client bodies but different server-side `options` / `override` values.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to