Re: [PR] feat: add fallback mechanism for specific error codes in ai-proxy-multi [apisix]

via GitHub Tue, 02 Sep 2025 01:58:27 -0700


kayx23 commented on code in PR #12571:
URL: https://github.com/apache/apisix/pull/12571#discussion_r2315403911



##########
docs/en/latest/plugins/ai-rate-limiting.md:
##########
@@ -413,9 +413,9 @@ X-AI-RateLimit-Reset-deepseek-instance: 0
 
 ### Configure Instance Priority and Rate Limiting
 
-The following example demonstrates how you can configure two models with different priorities and apply rate limiting on the instance with a higher priority. In the case where `fallback_strategy` is set to `instance_health_and_rate_limiting`, the Plugin should continue to forward requests to the low priority instance once the high priority instance's rate limiting quota is fully consumed.
+The following example demonstrates how you can configure two models with different priorities and apply rate limiting on the instance with a higher priority. In the case where `fallback_strategy` is set to `["rate_limiting"]`, the Plugin should continue to forward requests to the low priority instance once the high priority instance's rate limiting quota is fully consumed.

Review Comment:
   Sounds good. Just a reminder that there's more than one example to update.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
