janiussyafiq opened a new pull request, #13424:
URL: https://github.com/apache/apisix/pull/13424
### Description
PR-1 of the phased ai-cache series (RFC #13290; supersedes #13308 and #13370):
a minimal exact-match L1 cache for `openai-chat`, backed by Redis. It adds two
phase hooks: `access` short-circuits on a HIT, and `log` schedules an
`ngx.timer.at(0)` that issues SETEX for 2xx JSON responses of at most 1 MiB
(the timer is needed because cosockets are forbidden in `log_by_lua`). Priority
`1086` runs ahead of `ai-proxy` (`1040`), so hits skip the upstream entirely.
The cache key is `sha256({model, messages})`; the rest of the field whitelist,
effective-body input, scoping, the bypass header, and Prometheus support arrive
in PR-2 through PR-5.
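
For reviewers who want the keying and store conditions spelled out, here is a minimal sketch in Python (not the plugin's Lua). It only illustrates the rules stated above. The names `cache_key`, `is_cacheable`, and the `ai-cache:` prefix are hypothetical, and both the sorted-key JSON serialization and the `Content-Type` check are assumptions; the PR may canonicalize the body and detect JSON differently.

```python
import hashlib
import json

MAX_BODY_BYTES = 1024 * 1024  # 1 MiB ceiling from the description


def cache_key(body: dict, prefix: str = "ai-cache:") -> str:
    """Derive an exact-match key from `model` and `messages` only."""
    # Fields such as `temperature` are deliberately ignored in this sketch,
    # mirroring the two-field key described for PR-1.
    keyed = {"model": body.get("model"), "messages": body.get("messages")}
    # Assumption: sorted keys and compact separators give a stable encoding,
    # so identical requests always hash to the same digest.
    canonical = json.dumps(keyed, sort_keys=True, separators=(",", ":"),
                           ensure_ascii=False)
    return prefix + hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def is_cacheable(status: int, content_type: str, body: bytes) -> bool:
    """Store only 2xx JSON responses no larger than 1 MiB."""
    is_2xx = 200 <= status <= 299
    # Assumption: JSON is detected via the Content-Type header.
    is_json = content_type.lower().startswith("application/json")
    return is_2xx and is_json and len(body) <= MAX_BODY_BYTES
```

Because only `model` and `messages` feed the digest, two requests that differ in any other field share a cache entry until the wider field whitelist lands later in the series.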
`apisix/plugins/ai-providers/base.lua` gets a one-line stash
(`ctx.llm_raw_response_body = raw_res_body`) so the log phase can see the
parsed envelope without a `body_filter`.
Fail-open invariant holds: Redis errors degrade to MISS, never 5xx.
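
The fail-open behavior can be pictured with a short Python sketch, again illustrative rather than the plugin's Lua. `lookup_fail_open` and its `fetch` callable are hypothetical stand-ins for the Redis GET path: any store error is collapsed into a MISS so the request continues to the upstream.

```python
from typing import Callable, Optional


def lookup_fail_open(fetch: Callable[[str], Optional[bytes]],
                     key: str) -> Optional[bytes]:
    """Return the cached body, or None (a MISS) if absent or the store fails."""
    try:
        return fetch(key)
    except Exception:
        # A broken cache must never surface as a 5xx: treat it as a MISS
        # and let the caller proxy the request upstream as usual.
        return None
```

A caller treats `None` identically whether the key was absent or Redis was unreachable, which is what keeps the cache off the failure path.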
#### Which issue(s) this PR fixes:
Refs #13290
### Checklist
- [x] I have explained the need for this PR and the problem it solves
- [x] I have explained the changes or the new features added to this PR
- [x] I have added tests corresponding to this change
- [x] I have updated the documentation to reflect this change
- [x] I have verified that this change is backward compatible