nic-6443 opened a new pull request, #13565:
URL: https://github.com/apache/apisix/pull/13565

   ### Description
   
   Previously, when an upstream LLM provider returned `429` or a `5xx`, 
`ai-proxy` / `ai-proxy-multi` forwarded only the status code and closed the 
connection without reading the body. The provider's error payload (rate-limit 
details, validation errors, etc.) was discarded: the client got an empty body 
and nothing was logged, which made upstream failures hard to diagnose.
   
   This PR reads the upstream error body before closing the connection and 
routes it to where it is useful:
   
   - `ai-proxy-multi` logs the error body when it falls back to another 
instance, since that failed attempt's body never reaches the client (a later 
attempt responds instead).
   - When the request is not retried (single-instance `ai-proxy`, no matching 
`fallback_strategy`, `max_retries` exhausted, or the failure took longer than 
`retry_on_failure_within_ms`), the upstream status code and error body are 
returned to the client, preserving the upstream `Content-Type`.
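
The routing described in the two bullets above can be sketched roughly as follows. This is an illustrative Python model, not the plugin's actual Lua code: the `UpstreamFailure` type, the helper names, the `has_other_instance` flag, and the strategy labels `"http_429"` / `"http_5xx"` are assumptions made for the example. Only `fallback_strategy`, `max_retries`, and `retry_on_failure_within_ms` come from the description.

```python
import logging
from dataclasses import dataclass
from typing import Optional, Sequence

log = logging.getLogger("ai_proxy_sketch")


@dataclass
class UpstreamFailure:
    """Hypothetical record of one failed upstream attempt."""
    status: int
    body: bytes
    content_type: Optional[str]
    elapsed_ms: float


def strategy_matches(status: int, fallback_strategy: Sequence[str]) -> bool:
    # The labels "http_429" and "http_5xx" are assumed for illustration.
    if status == 429:
        return "http_429" in fallback_strategy
    if 500 <= status <= 599:
        return "http_5xx" in fallback_strategy
    return False


def should_retry(failure, fallback_strategy, retries_used, max_retries,
                 retry_on_failure_within_ms, has_other_instance) -> bool:
    if not has_other_instance:          # single-instance ai-proxy
        return False
    if not strategy_matches(failure.status, fallback_strategy):
        return False
    if retries_used >= max_retries:     # retries exhausted
        return False
    if failure.elapsed_ms > retry_on_failure_within_ms:
        return False                    # failure was too slow to retry
    return True


def handle_failure(failure, **policy):
    """Return None when falling back, else (status, headers, body)."""
    if should_retry(failure, **policy):
        # This attempt's body never reaches the client, so log it.
        log.warning("upstream returned %d, falling back; error body: %r",
                    failure.status, failure.body)
        return None
    headers = {}
    if failure.content_type:
        # Preserve the upstream Content-Type on the client response.
        headers["Content-Type"] = failure.content_type
    return failure.status, headers, failure.body
```

In this sketch, a `429` seen with `retries_used=0`, `max_retries=2`, and a matching strategy is logged and retried, while the same failure with `retries_used=2` is returned to the client with its original status, body, and `Content-Type`.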
   
   Error bodies are small, so the body is read with a single `read_body()`; no 
extra config is introduced.
   
   Fixes #13501
   
   ### Checklist
   
   - [x] I have explained the need for this PR and the problem it solves
   - [x] I have explained the changes or the new features added to this PR
   - [x] I have added tests corresponding to this change
   - [x] I have updated the documentation to reflect this change
   - [x] I have verified that this change is backward compatible
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
