membphis commented on PR #13578:
URL: https://github.com/apache/apisix/pull/13578#issuecomment-4787966538

   I rechecked the latest update. The no-`ai-proxy` guard looks addressed now: 
`ai-cache.access` checks `ctx.picked_ai_instance` first, uses the shared 
`ai-protocols.binding` `fail_mode`, and the tests cover both the default bypass 
behavior and `fail_mode=error`.
   
   One merge-blocking cache-key issue still remains:
   
   ### [P1] `share_across_routes` can still reuse a response across different 
effective models on plain `ai-proxy` routes
   
   The new scope includes `ctx.picked_ai_instance_name`, which fixes the 
`ai-proxy-multi` case because different picked instances have different names. 
However, for the plain `ai-proxy` plugin, `ctx.picked_ai_instance_name` is only 
`ai-proxy-<provider>` (for example, `ai-proxy-openai`). It does not include the 
route-level effective model.
   
   The fingerprint still uses the requested model from 
`ctx.var.request_llm_model or body.model`, not the effective model selected by 
`ai-proxy` from `ai_instance.options.model or request_model`. This leaves a 
real collision when cache sharing is enabled across routes:
   
   1. Route A uses `ai-proxy` with `provider=openai`, `options.model=gpt-4o`, 
and `ai-cache.cache_key.share_across_routes=true`.
   2. Route B uses `ai-proxy` with `provider=openai`, 
`options.model=gpt-4o-mini`, and the same Redis/cache settings.
   3. The client sends the same body to both routes, especially if the body 
omits `model` or carries the same requested `model`.
   4. Both routes compute the same scope (`instance=ai-proxy-openai`) and the 
same fingerprint, so Route B can replay Route A's cached response even though 
the effective upstream model is different.
   
   This also contradicts the new docs, which say that even with 
`share_across_routes` enabled, responses from different upstream models or 
providers are kept in separate cache entries.
   
   Please include the effective model in the key/scope for the plain `ai-proxy` 
case, for example by deriving it from `ctx.picked_ai_instance.options.model or 
body.model` before lookup. It would also be good to add a regression test with 
two plain `ai-proxy` routes using the same provider, different `options.model`, 
the same Redis, and `share_across_routes=true`; the second route should be a 
`MISS`, not a `HIT`.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to