uplsh580 commented on issue #65818:
URL: https://github.com/apache/airflow/issues/65818#issuecomment-4525328884
We are seeing the same class of deadlock on Airflow 3.1.8 / MySQL with 2
triggerer replicas.
In our case the deadlock happens on a single-row UPDATE through the
`Trigger.submit_event` → `handle_event_submit` path (not the bulk UPDATE paths
in `SchedulerJobRunner.check_trigger_timeouts` / `Trigger.clean_unused` that
#65920 / #65836 target). The exception is not retried anywhere in the call
chain (`triggerer_job_runner.handle_events` → `Trigger.submit_event` →
`handle_event_submit`), so it propagates up to `TriggerRunnerSupervisor.run`
and the triggerer process exits, after which Kubernetes restarts the pod. No
task fails, because the other triggerer picks up the deferred tasks, but the
restart shows up as an alert.
Environment
- Airflow 3.1.8
- MySQL (InnoDB)
- triggerer replicas: 2
- Workload: deferrable operators (Spark application completion event in this
case)
Failing statement
    UPDATE task_instance
    SET state='scheduled', scheduled_dttm=..., updated_at=..., trigger_id=NULL,
        next_kwargs=...
    WHERE task_instance.id = '019e511c-c507-74d8-ace0-f4c396cc8ef1'
→ `MySQLdb.OperationalError: (1213, 'Deadlock found when trying to get lock;
try restarting transaction')`
Stack
    airflow/jobs/triggerer_job_runner.py:534 run
    airflow/jobs/triggerer_job_runner.py:561 handle_events
    airflow/models/trigger.py:252 submit_event
    airflow/models/trigger.py:422 handle_event_submit
      → session.flush()
Looking at `airflow-core/src/airflow/models/trigger.py` on the 3.1.8 tag,
neither `submit_event` (L239) nor `handle_event_submit` (L394 / L426) carries
`@retry_db_transaction` or any try/except, while several deadlock-sensitive
paths on the scheduler side already do. Wrapping these entry points (or the
per-event loop in `triggerer_job_runner.handle_events`) with
`@retry_db_transaction` would also cover this single-row UPDATE case that the
existing PRs do not address.
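To make the proposal concrete, here is a minimal, stdlib-only sketch of the retry behaviour we have in mind. It is not Airflow's `retry_db_transaction` implementation and does not import Airflow. `retry_on_deadlock`, `FakeOperationalError`, `FakeSession`, and `handle_event_submit_sketch` are hypothetical names made up for illustration. The only details taken from the report are the MySQL error code 1213 and the fact that the failure surfaces at `session.flush()`.

```python
import functools
import time

MYSQL_DEADLOCK_ERRNO = 1213  # error code from the OperationalError in this report


class FakeOperationalError(Exception):
    """Hypothetical stand-in for MySQLdb.OperationalError; args[0] is the error code."""


class FakeSession:
    """Hypothetical stand-in for a DB session whose first `fail_times` flushes deadlock."""

    def __init__(self, fail_times):
        self.fail_times = fail_times
        self.flushes = 0
        self.rollbacks = 0

    def flush(self):
        self.flushes += 1
        if self.flushes <= self.fail_times:
            raise FakeOperationalError(
                MYSQL_DEADLOCK_ERRNO, "Deadlock found when trying to get lock"
            )

    def rollback(self):
        self.rollbacks += 1


def retry_on_deadlock(max_attempts=3, base_delay=0.0):
    """Retry the wrapped function when it fails with a deadlock error.

    The wrapped function must accept `session` as a keyword argument so the
    failed transaction can be rolled back before the next attempt.
    """

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, session, **kwargs):
            for attempt in range(1, max_attempts + 1):
                try:
                    return func(*args, session=session, **kwargs)
                except FakeOperationalError as exc:
                    is_deadlock = bool(exc.args) and exc.args[0] == MYSQL_DEADLOCK_ERRNO
                    # Re-raise anything that is not a deadlock, and the final failure.
                    if not is_deadlock or attempt == max_attempts:
                        raise
                    session.rollback()
                    # Exponential backoff; base_delay=0.0 disables the wait.
                    time.sleep(base_delay * 2 ** (attempt - 1))

        return wrapper

    return decorator


@retry_on_deadlock(max_attempts=3)
def handle_event_submit_sketch(task_instance_id, *, session):
    # Stand-in for the single-row UPDATE: the real code mutates the task
    # instance and then flushes, which is where the deadlock is raised.
    session.flush()
    return task_instance_id
```

With a session that deadlocks twice, the third attempt succeeds after two rollbacks. With a session that keeps deadlocking, the error is re-raised after `max_attempts`, so a persistent problem would still reach the supervisor rather than being swallowed.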