michael-s-molina commented on PR #34826: URL: https://github.com/apache/superset/pull/34826#issuecomment-3849113231
> So I think we need to revisit the architecture to come up with a design
where we can coordinate connection pooling across the whole worker fleet.
Agree with @villebro. To illustrate the problem in more detail:
#### 1. Connection Pool Multiplication
In `SINGLETON` mode, each pod/Celery worker has its own `EngineManager`
instance:
Pod 1: EngineManager → Pool(size=5) → 5 connections
Pod 2: EngineManager → Pool(size=5) → 5 connections
Pod 3: EngineManager → Pool(size=5) → 5 connections
...
With 10 pods and pool size 5, you could hit 50 connections to the
database, not 5. This can easily exhaust
database connection limits, especially for managed databases with strict
limits.
#### 2. SSH Tunnel Multiplication
Same problem with SSH tunnels:
```
# Each pod creates its own tunnel cache
self._tunnels: dict[TunnelKey, SSHTunnelForwarder] = {}
```
If you have 20 Celery workers across multiple pods, you could have 20
separate SSH tunnels to the same bastion
host. Many SSH servers limit concurrent connections per user (often 10 by
default).
#### 3. Celery Prefork Workers Make It Worse
Celery's default prefork model spawns multiple worker processes:
Pod 1:
└── Celery Worker (prefork, concurrency=4)
├── Process 1 → EngineManager → Pool
├── Process 2 → EngineManager → Pool
├── Process 3 → EngineManager → Pool
└── Process 4 → EngineManager → Pool
Each forked process gets its own memory space, so connection pools aren't
shared even within a single pod.
#### 4. No Cluster-Wide Coordination
The cache keys are deterministic but there's no coordination:
```
# manager.py:258
engine_key = self._get_engine_key(database, catalog, schema, source,
user_id)
```
Each process independently decides to create connections without knowing
cluster-wide state.
#### What's Missing for True Connection Pooling
For distributed connection pooling, you'd typically need:
1. External connection pooler like PgBouncer, ProxySQL, or cloud-native
poolers (RDS Proxy, Cloud SQL Proxy)
2. Centralized state via Redis or similar for tunnel coordination
3. Connection limits per-pod that account for cluster size
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
