leooamaral commented on issue #2498: URL: https://github.com/apache/apisix-ingress-controller/issues/2498#issuecomment-3189996295
Hello @Baoyuantop, sorry for the delay.

While checking the `apisix-ingress-controller` code, I noticed it doesn't currently use the `sigs.k8s.io/gateway-api-inference-extension` package.

**The Gateway API Inference Extension** is a project sponsored by [SIG Network](https://github.com/kubernetes/community/blob/master/sig-network/README.md#gateway-api-inference-extension) that extends the Gateway API to support advanced routing for LLM traffic. It introduces:

- **Two new CRDs:** `InferenceModel` and `InferencePool`,
- **A load-balancing algorithm** optimized for inference workloads,
- **Controllers** to support advanced routing of LLM traffic.

To accomplish this, the implementing gateway needs to support:

- [ext-proc](https://www.envoyproxy.io/docs/envoy/latest/configuration/http/http_filters/ext_proc_filter),
- [Gateway API](https://github.com/kubernetes-sigs/gateway-api).

The Inference Extension leverages metrics from the underlying LLM servers to make smarter routing and load-balancing decisions. It would be valuable for `apisix-ingress-controller` to integrate with this extension to better support AI/LLM workloads.

Would it be possible to add this support?
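To make the proposal more concrete, here is a rough sketch of what the two CRDs look like, loosely based on the upstream project's examples. The resource names, labels, and the `v1alpha2` API version are illustrative and may differ depending on the extension release:

```yaml
# Hypothetical example: an InferencePool groups model-server Pods
# (selected by label) behind a single routing target, and points at
# an endpoint-picker extension via extensionRef.
apiVersion: inference.networking.x-k8s.io/v1alpha2
kind: InferencePool
metadata:
  name: vllm-llama3-8b        # illustrative name
spec:
  targetPortNumber: 8000
  selector:
    app: vllm-llama3-8b       # illustrative label
  extensionRef:
    name: vllm-llama3-8b-epp  # illustrative endpoint-picker service
---
# Hypothetical example: an InferenceModel maps a client-facing model
# name onto a pool, with a criticality hint for load balancing.
apiVersion: inference.networking.x-k8s.io/v1alpha2
kind: InferenceModel
metadata:
  name: food-review           # illustrative name
spec:
  modelName: food-review
  criticality: Standard
  poolRef:
    name: vllm-llama3-8b      # must match the InferencePool above
```

The idea would be that an `HTTPRoute` managed by `apisix-ingress-controller` could reference an `InferencePool` as a backend, with the controller delegating endpoint selection to the extension's picker via ext-proc.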