membphis commented on PR #13578: URL: https://github.com/apache/apisix/pull/13578#issuecomment-4787966538
I rechecked the latest update. The no-`ai-proxy` guard looks addressed now: `ai-cache.access` checks `ctx.picked_ai_instance` first, uses the shared `ai-protocols.binding` `fail_mode`, and the tests cover both the default bypass behavior and `fail_mode=error`. One merge-blocking cache-key issue still remains: ### [P1] `share_across_routes` can still reuse a response across different effective models on plain `ai-proxy` routes The new scope includes `ctx.picked_ai_instance_name`, which fixes the `ai-proxy-multi` case because different picked instances have different names. However, for the plain `ai-proxy` plugin, `ctx.picked_ai_instance_name` is only `ai-proxy-<provider>` (for example, `ai-proxy-openai`). It does not include the route-level effective model. The fingerprint still uses the requested model from `ctx.var.request_llm_model or body.model`, not the effective model selected by `ai-proxy` from `ai_instance.options.model or request_model`. This leaves a real collision when cache sharing is enabled across routes: 1. Route A uses `ai-proxy` with `provider=openai`, `options.model=gpt-4o`, and `ai-cache.cache_key.share_across_routes=true`. 2. Route B uses `ai-proxy` with `provider=openai`, `options.model=gpt-4o-mini`, and the same Redis/cache settings. 3. The client sends the same body to both routes, especially if the body omits `model` or carries the same requested `model`. 4. Both routes compute the same scope (`instance=ai-proxy-openai`) and the same fingerprint, so Route B can replay Route A's cached response even though the effective upstream model is different. This also contradicts the new docs, which say that even with `share_across_routes` enabled, responses from different upstream models or providers are kept in separate cache entries. Please include the effective model in the key/scope for the plain `ai-proxy` case, for example by deriving it from `ctx.picked_ai_instance.options.model or body.model` before lookup. It would also be good to add a regression test with two plain `ai-proxy` routes using the same provider, different `options.model`, the same Redis, and `share_across_routes=true`; the second route should be a `MISS`, not a `HIT`. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
