ZackUhlenhuth commented on PR #31414:
URL: https://github.com/apache/airflow/pull/31414#issuecomment-1839139017
Hello all, I seem to be running into a new deadlock issue due to this
change. I've replaced my DAG name with DAGNAME in the output below:
```
[2023-12-03T12:43:48.487+0000] {{scheduler_job_runner.py:1426}} INFO - DAG DAGNAME scheduling was skipped, probably because the DAG record was locked
[2023-12-03T12:43:56.335+0000] {{base.py:73}} INFO - Using connection ID 'aws_default' for task execution.
[2023-12-03T12:43:58.509+0000] {{dagrun.py:632}} ERROR - Marking run <DagRun DAGNAME @ 2023-11-26 12:00:00+00:00: scheduled__2023-11-26T12:00:00+00:00, state:running, queued_at: 2023-12-03 12:00:00.743211+00:00. externally triggered: False> failed
```
It seems that the scheduler fails to acquire the DAG record lock and then
marks the entire DAG run as failed as a result.
Could someone explain the cases in which the scheduler would fail to
acquire the DAG record lock, and whether there is some configuration I can use
to mitigate this? I was not seeing such failures before upgrading from 2.5.1
to 2.7.2.