leooamaral commented on issue #2498:
URL: https://github.com/apache/apisix-ingress-controller/issues/2498#issuecomment-3189996295

   Hello @Baoyuantop,
   
   Sorry for the delay.
   
   While reviewing the `apisix-ingress-controller` code, I noticed it doesn't 
currently use the `sigs.k8s.io/gateway-api-inference-extension` package.
   
   **The Gateway API Inference Extension** is a project sponsored by [SIG 
Network](https://github.com/kubernetes/community/blob/master/sig-network/README.md#gateway-api-inference-extension)
 that extends the Gateway API to support advanced routing for LLM traffic. It 
introduces:
   - **two new CRDs**, `InferencePool` and `InferenceModel`;
   - **a load-balancing algorithm** optimized for inference workloads;
   - **controllers** that implement the advanced routing of LLM traffic.
   
   To integrate with the extension, a gateway implementation needs to support:
   - [ext-proc](https://www.envoyproxy.io/docs/envoy/latest/configuration/http/http_filters/ext_proc_filter),
   - the [Gateway API](https://github.com/kubernetes-sigs/gateway-api).
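   
   For context, here is a minimal sketch of the two CRDs the extension introduces. The field names follow the upstream project's published `v1alpha2` examples and may differ in other API versions; the pool, label, and model names are purely illustrative:
   
   ```yaml
   # An InferencePool groups a set of model-server Pods behind one routing target.
   apiVersion: inference.networking.x-k8s.io/v1alpha2
   kind: InferencePool
   metadata:
     name: llama-pool                  # illustrative name
   spec:
     selector:
       app: vllm-llama                 # labels of the model-server Pods
     targetPortNumber: 8000            # port the model servers listen on
     extensionRef:
       name: llama-pool-epp            # the ext-proc endpoint-picker service
   ---
   # An InferenceModel maps a requested model name onto a pool.
   apiVersion: inference.networking.x-k8s.io/v1alpha2
   kind: InferenceModel
   metadata:
     name: llama-base
   spec:
     modelName: meta-llama/Llama-3.1-8B-Instruct
     criticality: Critical             # scheduling-priority hint
     poolRef:
       name: llama-pool
   ```
   
   The gateway routes matching requests to the referenced pool, consulting the ext-proc endpoint picker to choose a backend.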
   
   The Inference Extension leverages metrics exposed by the underlying model 
servers to make smarter routing and load-balancing decisions.
   It would be valuable for `apisix-ingress-controller` to integrate with this 
extension to better support AI/LLM workloads.
   
   Would it be possible to add support for this?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
