nic-6443 commented on code in PR #13565:
URL: https://github.com/apache/apisix/pull/13565#discussion_r3419740941
##########
apisix/plugins/ai-proxy/base.lua:
##########
@@ -298,15 +313,24 @@ function _M.before_proxy(conf, ctx, on_error)
core.response.set_response_source(ctx, "upstream")
if res.status == 429 or (res.status >= 500 and res.status < 600)
then
+ -- Read the upstream error body before closing so the
provider's
+ -- error details survive: logged on fallback (see
retry_on_error)
+ -- and returned to the client when no retry happens.
+ local error_body = read_upstream_error_body(res)
+ local content_type = res.headers["Content-Type"]
+ if content_type then
+ core.response.set_header("Content-Type", content_type)
+ end
Review Comment:
Good catch on the theoretical leak, but I think it's harmless in practice so
I'll leave it. The common no-retry case (single instance, or the final 429/5xx
in a fallback chain) goes through this branch and sets the correct
`Content-Type`. A stale header can only survive when a *later* retry attempt
returns without setting one — a transport error, a no-body 500, etc. — and
those paths all return an empty body, so at most it's a content-type label on
an empty error response. And since LLM error bodies are `application/json` in
practice, even the leaked value matches what you'd want.
##########
apisix/plugins/ai-proxy/base.lua:
##########
@@ -298,15 +313,24 @@ function _M.before_proxy(conf, ctx, on_error)
core.response.set_response_source(ctx, "upstream")
if res.status == 429 or (res.status >= 500 and res.status < 600)
then
+ -- Read the upstream error body before closing so the
provider's
+ -- error details survive: logged on fallback (see
retry_on_error)
+ -- and returned to the client when no retry happens.
+ local error_body = read_upstream_error_body(res)
+ local content_type = res.headers["Content-Type"]
+ if content_type then
+ core.response.set_header("Content-Type", content_type)
+ end
if res._t0 then
apisix_upstream.update_upstream_state({
response_time = (ngx_now() - res._t0) * 1000,
+ response_length = res._upstream_bytes or 0,
})
end
if res._httpc then
res._httpc:close()
end
- return res.status
+ return res.status, error_body
Review Comment:
Good catch on the theoretical leak, but I think it's harmless in practice so
I'll leave it. The common no-retry case (single instance, or the final 429/5xx
in a fallback chain) goes through this branch and sets the correct
`Content-Type`. A stale header can only survive when a *later* retry attempt
returns without setting one — a transport error, a no-body 500, etc. — and
those paths all return an empty body, so at most it's a content-type label on
an empty error response. And since LLM error bodies are `application/json` in
practice, even the leaked value matches what you'd want.
##########
apisix/plugins/ai-proxy-multi.lua:
##########
@@ -569,11 +569,17 @@ local function retry_on_error(ctx, conf, code)
end
end
+ local failed_instance = ctx.picked_ai_instance_name
local name, ai_instance, err = pick_ai_instance(ctx, conf)
if err then
core.log.error("failed to pick new AI instance: ", err)
return 502
end
+ -- The failed attempt's body never reaches the client (a later attempt
+ -- responds instead), so surface the upstream error here for
diagnostics.
+ core.log.warn("ai instance ", failed_instance, " returned status ",
code,
+ ", falling back to ", name, ". upstream error body: ",
+ body or "")
Review Comment:
This branch only handles 429/5xx — auth failures that could echo an API key
(401/403) go through the normal `parse_response` path, not here, so these
bodies are rate-limit / server-error text rather than credentials. There's also
no response-body redaction anywhere in the codebase (`log_sanitize` only
redacts request auth via `redact_params`, and `llm_response_text` is logged
as-is), and the whole point of this fix is to surface the provider's diagnostic
body — truncating or masking would corrupt it and give false confidence. The
same body is returned to the client that made the request, so it isn't newly
exposed.
##########
t/plugin/ai-proxy.t:
##########
@@ -1263,3 +1263,65 @@ got token usage from ai service:
}
--- response_body
OK: auth.query is clean
+
+
+
+=== TEST 38: set route to an upstream that returns 5xx with an error body
+--- config
+ location /t {
+ content_by_lua_block {
+ local t = require("lib.test_admin").test
+ local code, body = t('/apisix/admin/routes/1',
+ ngx.HTTP_PUT,
+ [[{
+ "uri": "/anything",
+ "plugins": {
+ "ai-proxy": {
+ "provider": "openai-compatible",
+ "auth": {
+ "header": {
+ "Authorization": "Bearer token"
+ }
+ },
+ "options": {
+ "model": "custom"
+ },
+ "override": {
+ "endpoint":
"http://127.0.0.1:6725/v1/chat/completions"
+ },
+ "ssl_verify": false
+ }
+ }
+ }]]
+ )
+
+ if code >= 300 then
+ ngx.status = code
+ end
+ ngx.say(body)
+ }
+ }
+--- response_body
+passed
+
+
+
+=== TEST 39: single-instance ai-proxy returns the upstream 5xx error body to
the client
+--- http_config
+ server {
+ server_name internal_error;
+ listen 6725;
+ default_type 'application/json';
+ location / {
+ content_by_lua_block {
+ ngx.status = 500
+ ngx.say([[{ "error": {"message":"upstream boom"}}]])
+ return
+ }
+ }
+ }
+--- request
+POST /anything
+{ "messages": [ { "role": "user", "content": "What is 1+1?"} ] }
+--- error_code: 500
+--- response_body_like: upstream boom
Review Comment:
Good idea — added a `Content-Type: application/json` assertion to TEST 39 in
9840b94.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]