membphis commented on PR #13578: URL: https://github.com/apache/apisix/pull/13578#issuecomment-4805713517
I think this should be addressed before merge: the cache key is currently computed before `ai-proxy` builds the final upstream request, so it does not cover all server-side mutations that can affect the LLM response. `ai-cache` computes `ctx.ai_cache_fingerprint` in `access` from the client request body and then scopes the key mostly by instance/model/route/consumer/vars. However, `ai-proxy` later mutates the request that is actually sent upstream through `options`, `override.llm_options`, and `override.request_body`. Those fields can change response-determining parameters such as `temperature`, `top_p`, `max_tokens`, tools, provider-specific body fields, etc. A concrete failure case: 1. Two routes enable `ai-cache` with `cache_key.share_across_routes = true`. 2. The client sends the same prompt and model to both routes. 3. Route A configures `ai-proxy.options.temperature = 0.2`. 4. Route B configures `ai-proxy.options.temperature = 0.8`, or uses `override.request_body` / `override.llm_options` to change the final upstream request. 5. Route A warms the cache first. 6. Route B can hit Route A's cached response even though the actual upstream LLM request should be different. This breaks the exact-cache contract because the key no longer represents the final request that produced the cached response. It can return a response generated under a different route-side AI configuration. Suggested fix: base the cache key on the final canonical upstream request body after `ai-proxy` has applied provider conversion, `options`, `override.llm_options`, and `override.request_body`; or include a canonicalized representation of all response-determining AI instance configuration in the key. Please also add regression tests covering shared-cache routes with identical client bodies but different server-side `options` / `override` values. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
