Re: [PR] feat(ai-lakera-guard): scan LLM responses (direction output/both, non-streaming + streaming) [apisix]

via GitHub Wed, 24 Jun 2026 22:50:27 -0700


nic-6443 commented on code in PR #13606:
URL: https://github.com/apache/apisix/pull/13606#discussion_r3472165714



##########
apisix/plugins/ai-lakera-guard.lua:
##########
@@ -206,4 +215,60 @@ function _M.access(conf, ctx)
 end
 
 
+function _M.lua_body_filter(conf, ctx, headers, body)

Review Comment:
   Stepping back on the framework: this response-scanning path diverges 
structurally from `ai-aliyun-content-moderation`, and it'd be good to converge 
so operators (and maintainers) reason about both moderation plugins the same 
way.
   
   - **Streaming model.** aliyun exposes two stream modes via 
`stream_check_mode`: `final_packet` streams tokens through live and only 
annotates the last packet with `risk_level`, and `realtime` batch-scans and can 
cut mid-stream. Neither buffers-and-withholds the whole response. This PR adds 
a third, hardcoded model — buffer every chunk (emitting `:\n\n` keep-alives), 
scan once at end-of-stream, then block-or-release. That's a genuinely stronger 
*blocking* model, which makes sense for a guard (annotate-only would defeat the 
point), but right now it's private to this plugin and not selectable. Could it 
be modeled as an explicit mode in the same vocabulary as `stream_check_mode` — 
ideally lifted into a shared helper both plugins can call — instead of a 
per-plugin reimplementation?
   
   - **SSE handling.** aliyun goes through `apisix.plugins.ai-transport.sse` 
(`sse.decode`/`sse.encode`) plus the protocol helpers (`proto.is_done_event` / 
`proto.build_done_event`) for stream framing. Here the release/heartbeat path 
hand-rolls a raw `:\n\n` literal and `concat(buffer)` of raw chunks. Reusing 
the `sse` + `build_done_event` helpers would drop the magic constant and keep 
framing consistent across both.
   
   - **Config vocabulary.** aliyun gates with `check_request` / 
`check_response` booleans; this uses a `direction` enum. `direction` is 
arguably cleaner, but two moderation plugins with different config shapes is 
the kind of drift worth a conscious decision now rather than later.
   
   The deny body already reuses `proto.build_deny_response`, so the 
non-streaming path is nicely consistent — it's mainly the streaming/SSE/config 
surface that forks. What do you think about converging these?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Re: [PR] feat(ai-lakera-guard): scan LLM responses (direction output/both, non-streaming + streaming) [apisix]

Reply via email to