John Arnold created AIRFLOW-2340:
------------------------------------
Summary: SQLalchemy pessimistic connection handling not working
Key: AIRFLOW-2340
URL: https://issues.apache.org/jira/browse/AIRFLOW-2340
Project: Apache Airflow
Issue Type: Bug
Components: scheduler
Affects Versions: 1.9.0
Reporter: John Arnold
Attachments: airflow_traceback.txt
Our scheduler keeps crashing, about once a day. It seems to be triggered by a
failure to connect to the postgresql database, but then it doesn't recover and
crashes the scheduler over and over.
The scheduler runs in a container in our environment, so after several
container restarts, docker gives up and the container stays down.
Airflow should be able to recover from a connection failure without blowing up
the container altogether. Perhaps some exponential backoff is needed?
See attached log from the scheduler.
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)