Baoyuantop commented on issue #13083: URL: https://github.com/apache/apisix/issues/13083#issuecomment-4039295012
Thank you for the detailed proposal. The overall direction looks right to me, but a few technical details in your PR need to be confirmed:

1. Health check bypassed: the client-preference path matches instances directly, without going through health-check filtering. If the instance serving the client-specified model is already down, the request is still sent to it and fails with a 5xx error.
2. No input validation for the `models` field: if the client sends a non-array value (such as a string or a number), `match_client_models()` returns a 500 error. This is a gateway; it should not crash on a malformed request.
3. Appending unmatched instances breaks server priority: after matching, the remaining instances are appended without preserving their original priority order. As a result, the priority configured by the administrator stops applying as soon as a client request arrives.
4. The `models` field is removed from the request body only when the feature is enabled: it should be removed unconditionally. Otherwise this non-standard field is passed through to the upstream LLM, which may lead to unpredictable behavior.

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at: [email protected]
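To make the four review points concrete, here is a minimal, language-neutral sketch in Python. It is not the plugin's Lua code. The function name `pick_order`, the instance shape (`name`, `model`), the `healthy` set, and the `feature_enabled` flag are all hypothetical, and the sketch assumes `instances` is already sorted by the administrator's configured priority.

```python
from typing import Any, Dict, List, Set, Tuple

# Hypothetical instance shape: {"name": str, "model": str}
Instance = Dict[str, Any]


def pick_order(
    body: Dict[str, Any],
    instances: List[Instance],
    healthy: Set[str],
    feature_enabled: bool,
) -> Tuple[int, List[Instance]]:
    """Return (HTTP status, instances in the order they should be tried)."""
    # Point 4: always strip the non-standard field so it never reaches upstream.
    models = body.pop("models", None)

    # Point 1: apply health filtering before any client preference is considered.
    candidates = [i for i in instances if i["name"] in healthy]

    if not feature_enabled or models is None:
        return 200, candidates

    # Point 2: reject malformed input with a 4xx instead of failing with a 500.
    if not isinstance(models, list) or not all(isinstance(m, str) for m in models):
        return 400, []

    # Rank each requested model by its first position in the client's list.
    rank: Dict[str, int] = {}
    for idx, model in enumerate(models):
        rank.setdefault(model, idx)

    # sorted() is stable, so instances sharing a model keep their configured order.
    matched = sorted(
        (i for i in candidates if i["model"] in rank),
        key=lambda i: rank[i["model"]],
    )
    # Point 3: unmatched instances keep their configured priority order.
    unmatched = [i for i in candidates if i["model"] not in rank]
    return 200, matched + unmatched
```

In this sketch, health filtering and removal of `models` happen before the feature flag is checked, so neither depends on whether client preference is enabled.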
