vatsrahul1001 opened a new issue, #63535:
URL: https://github.com/apache/airflow/issues/63535
### Apache Airflow version
main (development)
### If "Other Airflow 3 version" selected, which one?
_No response_
### What happened?
The downgrade path of migration `0101_3_2_0_ui_improvements_for_deadlines` creates a savepoint for every `serialized_dag` row via `conn.begin_nested()` to isolate per-DAG errors during deadline alert data restoration:
```python
for serialized_dag_id, dag_id, data, data_compressed in batch_results:
    savepoint = conn.begin_nested()
    try:
        # ... process DAG ...
        savepoint.commit()
    except Exception as e:
        savepoint.rollback()
```
On PostgreSQL, locks acquired within a savepoint are not released when the
savepoint is committed — they persist in the parent transaction until the outer
COMMIT. Since the entire migration runs in a single Alembic transaction, lock
entries accumulate across all DAGs.
PostgreSQL's shared lock table size is `max_locks_per_transaction` ×
`max_connections` (default: 64 × 100 = 6,400 entries). Each savepoint acquires
multiple lock entries (relation-level locks on `serialized_dag` and `deadline_alert`,
plus the savepoint virtual transaction ID). With thousands of DAGs the lock
table overflows.
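A rough back-of-the-envelope estimate of where the overflow lands, assuming (hypothetically) that each savepoint pins about three lock entries in the parent transaction, matches the ~1K–2K DAG threshold described below:

```python
# PostgreSQL defaults (from postgresql.conf)
max_locks_per_transaction = 64
max_connections = 100

# Shared lock table capacity = max_locks_per_transaction * max_connections
lock_table_capacity = max_locks_per_transaction * max_connections
print(lock_table_capacity)  # 6400 entries

# Assumption for illustration: each savepoint leaves ~3 entries pinned
# (two relation-level locks plus the savepoint's virtual transaction ID).
entries_per_savepoint = 3
rows_until_overflow = lock_table_capacity // entries_per_savepoint
print(rows_until_overflow)  # ~2133 rows before the lock table fills
```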
Observed behavior:
With 100K serialized_dag entries the migration failed after processing ~20K
DAGs (batch 2 of 11):
```
psycopg2.errors.OutOfMemory: out of shared memory
HINT:  You might need to increase max_locks_per_transaction.
CONTEXT:  parallel worker
```
This was followed by a secondary RecursionError as SQLAlchemy tried to
unwind 970+ nested savepoint objects via recursive _cancel() calls during
rollback:
```
RecursionError: maximum recursion depth exceeded
```
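A minimal sketch (not SQLAlchemy's actual implementation) of why recursively unwinding ~970 nested savepoint objects blows past CPython's default recursion limit of 1,000 frames:

```python
import sys

def cancel(depth):
    """Toy stand-in for a recursive _cancel(): each nested savepoint
    cancels its child before itself, one stack frame per level."""
    if depth == 0:
        return
    cancel(depth - 1)

print(sys.getrecursionlimit())  # 1000 by default

try:
    cancel(2000)  # far deeper than the default limit allows
except RecursionError:
    print("RecursionError")  # raised before the unwind completes
```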
Deployments with more than ~1K–2K DAGs will likely hit this on default
PostgreSQL settings. The exact threshold depends on how many lock entries each
savepoint consumes and the configured max_locks_per_transaction and
max_connections.
### What you think should happen instead?
_No response_
### How to reproduce
Steps to reproduce:
- Start Airflow 3.1.8 with a PostgreSQL backend (default `max_locks_per_transaction = 64`)
- Create 2K+ DAGs so that `serialized_dag` is populated with deadline alert data
- Upgrade to 3.2.0 (`airflow db migrate`)
- Run `airflow db downgrade -n 3.1.8`
- Migration 0101 fails with the out of shared memory error
### Operating System
macOS
### Versions of Apache Airflow Providers
_No response_
### Deployment
Official Apache Airflow Helm Chart
### Deployment details
_No response_
### Anything else?
_No response_
### Are you willing to submit PR?
- [ ] Yes I am willing to submit a PR!
### Code of Conduct
- [x] I agree to follow this project's [Code of
Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)