nic-6443 commented on code in PR #13565:
URL: https://github.com/apache/apisix/pull/13565#discussion_r3419740941


##########
apisix/plugins/ai-proxy/base.lua:
##########
@@ -298,15 +313,24 @@ function _M.before_proxy(conf, ctx, on_error)
             core.response.set_response_source(ctx, "upstream")
 
             if res.status == 429 or (res.status >= 500 and res.status < 600) 
then
+                -- Read the upstream error body before closing so the 
provider's
+                -- error details survive: logged on fallback (see 
retry_on_error)
+                -- and returned to the client when no retry happens.
+                local error_body = read_upstream_error_body(res)
+                local content_type = res.headers["Content-Type"]
+                if content_type then
+                    core.response.set_header("Content-Type", content_type)
+                end

Review Comment:
   Good catch on the theoretical leak, but I think it's harmless in practice so 
I'll leave it. The common no-retry case (single instance, or the final 429/5xx 
in a fallback chain) goes through this branch and sets the correct 
`Content-Type`. A stale header can only survive when a *later* retry attempt 
returns without setting one — a transport error, a no-body 500, etc. — and 
those paths all return an empty body, so at most it's a content-type label on 
an empty error response. And since LLM error bodies are `application/json` in 
practice, even the leaked value matches what you'd want.



##########
apisix/plugins/ai-proxy/base.lua:
##########
@@ -298,15 +313,24 @@ function _M.before_proxy(conf, ctx, on_error)
             core.response.set_response_source(ctx, "upstream")
 
             if res.status == 429 or (res.status >= 500 and res.status < 600) 
then
+                -- Read the upstream error body before closing so the 
provider's
+                -- error details survive: logged on fallback (see 
retry_on_error)
+                -- and returned to the client when no retry happens.
+                local error_body = read_upstream_error_body(res)
+                local content_type = res.headers["Content-Type"]
+                if content_type then
+                    core.response.set_header("Content-Type", content_type)
+                end
                 if res._t0 then
                     apisix_upstream.update_upstream_state({
                         response_time = (ngx_now() - res._t0) * 1000,
+                        response_length = res._upstream_bytes or 0,
                     })
                 end
                 if res._httpc then
                     res._httpc:close()
                 end
-                return res.status
+                return res.status, error_body

Review Comment:
   Good catch on the theoretical leak, but I think it's harmless in practice so 
I'll leave it. The common no-retry case (single instance, or the final 429/5xx 
in a fallback chain) goes through this branch and sets the correct 
`Content-Type`. A stale header can only survive when a *later* retry attempt 
returns without setting one — a transport error, a no-body 500, etc. — and 
those paths all return an empty body, so at most it's a content-type label on 
an empty error response. And since LLM error bodies are `application/json` in 
practice, even the leaked value matches what you'd want.



##########
apisix/plugins/ai-proxy-multi.lua:
##########
@@ -569,11 +569,17 @@ local function retry_on_error(ctx, conf, code)
             end
         end
 
+        local failed_instance = ctx.picked_ai_instance_name
         local name, ai_instance, err = pick_ai_instance(ctx, conf)
         if err then
             core.log.error("failed to pick new AI instance: ", err)
             return 502
         end
+        -- The failed attempt's body never reaches the client (a later attempt
+        -- responds instead), so surface the upstream error here for 
diagnostics.
+        core.log.warn("ai instance ", failed_instance, " returned status ", 
code,
+                      ", falling back to ", name, ". upstream error body: ",
+                      body or "")

Review Comment:
   This branch only handles 429/5xx — auth failures that could echo an API key 
(401/403) go through the normal `parse_response` path, not here, so these 
bodies are rate-limit / server-error text rather than credentials. There's also 
no response-body redaction anywhere in the codebase (`log_sanitize` only 
redacts request auth via `redact_params`, and `llm_response_text` is logged 
as-is), and the whole point of this fix is to surface the provider's diagnostic 
body — truncating or masking would corrupt it and give false confidence. The 
same body is returned to the client that made the request, so it isn't newly 
exposed.



##########
t/plugin/ai-proxy.t:
##########
@@ -1263,3 +1263,65 @@ got token usage from ai service:
     }
 --- response_body
 OK: auth.query is clean
+
+
+
+=== TEST 38: set route to an upstream that returns 5xx with an error body
+--- config
+    location /t {
+        content_by_lua_block {
+            local t = require("lib.test_admin").test
+            local code, body = t('/apisix/admin/routes/1',
+                 ngx.HTTP_PUT,
+                 [[{
+                    "uri": "/anything",
+                    "plugins": {
+                        "ai-proxy": {
+                            "provider": "openai-compatible",
+                            "auth": {
+                                "header": {
+                                    "Authorization": "Bearer token"
+                                }
+                            },
+                            "options": {
+                                "model": "custom"
+                            },
+                            "override": {
+                                "endpoint": 
"http://127.0.0.1:6725/v1/chat/completions";
+                            },
+                            "ssl_verify": false
+                        }
+                    }
+                }]]
+            )
+
+            if code >= 300 then
+                ngx.status = code
+            end
+            ngx.say(body)
+        }
+    }
+--- response_body
+passed
+
+
+
+=== TEST 39: single-instance ai-proxy returns the upstream 5xx error body to 
the client
+--- http_config
+    server {
+        server_name internal_error;
+        listen 6725;
+        default_type 'application/json';
+        location / {
+            content_by_lua_block {
+                ngx.status = 500
+                ngx.say([[{ "error": {"message":"upstream boom"}}]])
+                return
+            }
+        }
+    }
+--- request
+POST /anything
+{ "messages": [ { "role": "user", "content": "What is 1+1?"} ] }
+--- error_code: 500
+--- response_body_like: upstream boom

Review Comment:
   Good idea — added a `Content-Type: application/json` assertion to TEST 39 in 
9840b94.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to