[PR] feat: ai-cache plugin (minimal exact cache for openai-chat) [PR-1] [apisix]

via GitHub Fri, 22 May 2026 09:28:19 -0700


janiussyafiq opened a new pull request, #13424:
URL: https://github.com/apache/apisix/pull/13424


   ### Description
   
   PR-1 of the ai-cache phased series (RFC #13290, supersedes #13308 / #13370): 
a minimal exact-match L1 cache for `openai-chat` backed by Redis. It adds two 
phase hooks: `access` short-circuits on a HIT, and `log` schedules an 
`ngx.timer.at(0)` callback that issues SETEX for 2xx JSON responses of at most 
1 MiB (cosockets are forbidden in `log_by_lua`, so the write has to run in a 
timer). Priority `1086` is higher than `ai-proxy=1040`, so the plugin runs 
first and hits skip the upstream entirely. The cache key is 
`sha256({model, messages})`; the rest of the field whitelist, effective-body 
input, scoping, bypass header, and Prometheus arrive in PR-2 → PR-5.
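
As a rough illustration of the key derivation and the store condition described above, here is a language-agnostic sketch in Python (the plugin itself is Lua). The function names, the `ai-cache:` key prefix, and the canonical JSON serialization are assumptions made for this example; the description does not say how `{model, messages}` is serialized before hashing, nor whether the content-type check is a prefix match.

```python
import hashlib
import json

ONE_MIB = 1 << 20  # 1 MiB size ceiling mentioned in the description


def cache_key(body: dict, prefix: str = "ai-cache:") -> str:
    """Derive an exact-match key from only `model` and `messages`.

    The prefix and the serialization scheme are hypothetical.
    """
    subset = {"model": body.get("model"), "messages": body.get("messages")}
    # Sorted keys and fixed separators give a stable byte string, so the
    # order of fields in the incoming JSON does not change the digest.
    canonical = json.dumps(
        subset, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    )
    return prefix + hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def is_cacheable(status: int, content_type, body: bytes,
                 max_bytes: int = ONE_MIB) -> bool:
    """Return True for a 2xx JSON response no larger than max_bytes."""
    if not 200 <= status < 300:
        return False
    if not content_type or not content_type.lower().startswith("application/json"):
        return False
    return len(body) <= max_bytes
```

Under this sketch, two requests that differ only in a field outside `{model, messages}` (for example `temperature`) map to the same key, which appears consistent with the remaining whitelist fields being deferred to later PRs.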
   
   `apisix/plugins/ai-providers/base.lua` gets a one-line stash 
(`ctx.llm_raw_response_body = raw_res_body`) so the log phase can see the 
parsed envelope without a `body_filter`.
   
   Fail-open invariant holds: Redis errors degrade to MISS, never 5xx.
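
The fail-open behaviour can be pictured with a small Python sketch (again illustrative only, not the plugin's Lua code). The `lookup` helper and the `fetch` callable are hypothetical stand-ins for the Redis GET path; the point is only that any backend failure is reported as a MISS so the request continues to the upstream.

```python
def lookup(fetch, key):
    """Return ("HIT", value) or ("MISS", None); never raise.

    `fetch` is a hypothetical callable that returns the cached value,
    returns None when the key is absent, or raises on a backend error.
    """
    try:
        value = fetch(key)
    except Exception:
        # Fail open: a cache backend error must not surface as a 5xx,
        # so it is treated exactly like an absent key.
        return "MISS", None
    if value is None:
        return "MISS", None
    return "HIT", value


def broken_backend(key):
    # Simulates an unreachable Redis instance.
    raise ConnectionError("redis unavailable")


store = {"k1": '{"choices": []}'}
print(lookup(store.get, "k1"))       # found in the cache
print(lookup(store.get, "k2"))       # absent key
print(lookup(broken_backend, "k1"))  # backend error degrades to MISS
```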
   
   #### Which issue(s) this PR fixes:
   
   Refs #13290
   
   ### Checklist
   
   - [x] I have explained the need for this PR and the problem it solves
   - [x] I have explained the changes or the new features added to this PR
   - [x] I have added tests corresponding to this change
   - [x] I have updated the documentation to reflect this change
   - [x] I have verified that this change is backward compatible


