aicam opened a new issue, #4190:
URL: https://github.com/apache/texera/issues/4190

   ### Task Summary
   
   # Proposal: Unify Proxy Architecture with Envoy Gateway
   
   ## Summary
   We are proposing a migration from our current dual-proxy setup (Ingress 
Nginx + Envoy) to a unified **Envoy Gateway** architecture. This change aims to 
simplify our infrastructure, adopt the modern Kubernetes Gateway API, and 
natively support the dynamic routing requirements of our ephemeral Computing 
Units.
   
   ## Motivation & Current Limitations
   Our system currently relies on **Ingress Nginx** for static routing and a 
separate **Envoy** instance for dynamic routing. While Ingress Nginx is a 
standard solution, it presents significant architectural limitations for our 
specific workload, particularly regarding the **Computing Units** in the 
`texera-workflow-computing-unit-pool` namespace.
   
   ### The Problems with the Current Stack
   1.  **Lack of Dynamic Routing:** Ingress Nginx operates on a configuration 
reload model. Each time the set of routed upstream services changes, Nginx must 
regenerate and reload its configuration. This is inefficient for our Computing 
Units, which are highly transient (created and terminated on demand).
   2.  **Dual-Proxy Complexity:** To work around Nginx's limitations, we 
introduced a secondary Envoy proxy to handle WebSocket and HTTP connections to 
these dynamic units. The result is a "double-hop" architecture (User -> Ingress 
-> Envoy -> Compute), which adds latency and operational overhead.
   3.  **Legacy Architecture:** Nginx's process-based architecture is less 
suited to high-churn service discovery than Envoy's threaded architecture, 
which uses the xDS protocol to apply configuration updates without reloads or 
hot restarts.
   
   ## Proposed Solution: Envoy Gateway
   We propose replacing both the Ingress Controller and the standalone Envoy 
proxy with **Envoy Gateway**. 
   
   Envoy Gateway is a Kubernetes-native implementation of the **Gateway API**. 
It manages Envoy proxies as the data plane, allowing us to handle both static 
system routes (Web App, Config Service) and dynamic ephemeral routes (Computing 
Units) in a single, unified layer.
   
   ### Key Benefits
   * **Unified Architecture:** Eliminates the maintenance overhead of managing 
two different proxy technologies.
   * **Native Dynamic Routing:** Envoy natively supports service discovery via 
xDS, allowing it to route to ephemeral pods in the 
`texera-workflow-computing-unit-pool` without the need for constant 
configuration reloads.
   * **Modern Standard:** Adopting the [Kubernetes Gateway 
API](https://gateway-api.sigs.k8s.io/) future-proofs our networking stack with 
standard resources (`Gateway`, `HTTPRoute`) rather than vendor-specific 
annotations.
   * **Protocol Support:** Seamless support for both HTTP/2 and WebSockets, 
which are critical for the interactive nature of our workflow system.
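
   To make the dynamic-routing benefit more concrete, here is a minimal sketch 
of how a per-unit `HTTPRoute` could be generated when a Computing Unit is 
created. It is not the actual implementation: the `computing-unit-<id>` naming, 
the `/wsapi/cu/<id>` path, the port, and the gateway name and namespace are all 
hypothetical. Only the `texera-workflow-computing-unit-pool` namespace comes 
from this proposal.

```python
import json

GATEWAY_API_VERSION = "gateway.networking.k8s.io/v1"
POOL_NAMESPACE = "texera-workflow-computing-unit-pool"


def computing_unit_route(unit_id: int,
                         gateway_name: str = "texera-gateway",  # hypothetical
                         gateway_namespace: str = "texera",     # hypothetical
                         port: int = 8085) -> dict:             # hypothetical
    """Build an HTTPRoute manifest for a single computing unit."""
    name = f"computing-unit-{unit_id}"  # assumed Service/route naming scheme
    return {
        "apiVersion": GATEWAY_API_VERSION,
        "kind": "HTTPRoute",
        "metadata": {"name": name, "namespace": POOL_NAMESPACE},
        "spec": {
            # Attach to the shared Gateway, which lives in another namespace.
            "parentRefs": [{"name": gateway_name, "namespace": gateway_namespace}],
            "rules": [{
                "matches": [{
                    "path": {"type": "PathPrefix", "value": f"/wsapi/cu/{unit_id}"},
                }],
                # The backend Service is resolved in the route's own namespace.
                "backendRefs": [{"name": name, "port": port}],
            }],
        },
    }


if __name__ == "__main__":
    print(json.dumps(computing_unit_route(42), indent=2))
```

   Creating or deleting such a route alongside the unit would let the control 
plane push the change to Envoy over xDS instead of reloading a proxy. Gateway 
API `PathPrefix` matching works on whole path elements, so a route for unit 4 
would not also capture requests for unit 42.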
   
   ## Architecture Comparison
   
   ### Current Architecture
   *Traffic flows through Ingress Nginx for static routes, but requires a 
secondary Envoy hop for dynamic Computing Units.*
   <img width="1119" height="509" alt="Screenshot from 2026-01-29 13-16-45" src="https://github.com/user-attachments/assets/b73d1403-f973-4153-804e-b330c2824c19" />
   
   ### Target Architecture
   *A single Envoy Gateway layer handles all ingress traffic, routing directly 
to services and dynamically discovering Computing Units.*
   <img width="1253" height="559" alt="Screenshot from 2026-01-29 13-21-07" src="https://github.com/user-attachments/assets/1006d958-3c16-4acd-a902-30e4501e6361" />
   
   
   ### Priority
   
   P2 – Medium
   
   ### Task Type
   
   - [ ] Code Implementation
   - [ ] Documentation
   - [ ] Refactor / Cleanup
   - [ ] Testing / QA
   - [ ] DevOps / Deployment


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
