Amit Ghosh created AIRFLOW-3080:
-----------------------------------

             Summary: Mysql OperationalError occurs during heartbeat or any DB 
operation
                 Key: AIRFLOW-3080
                 URL: https://issues.apache.org/jira/browse/AIRFLOW-3080
             Project: Apache Airflow
          Issue Type: Bug
          Components: scheduler, worker
    Affects Versions: 1.10.0
            Reporter: Amit Ghosh
            Assignee: Amit Ghosh


When Airflow uses MySQL, runs many worker instances, and no DAG has been 
executed for a long time, MySQL raises "_mysql_exceptions.OperationalError".

The main issue is stale connections: if no request reaches the DB from a given 
SQLAlchemy pool for some time, MySQL drops the idle connections, and the first 
DB request made on one of them fails with this error. I am working on a fix and 
will commit it; it should also work for other databases.
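
One common way to handle stale pooled connections is to test each connection 
with a cheap query before handing it out and to replace it if the test fails. 
The sketch below is illustrative only and is not the fix proposed in this 
issue: PrePingPool and its methods are hypothetical names, and sqlite3 stands 
in for MySQL so the example runs with the standard library alone. SQLAlchemy 
offers comparable behaviour through the pool_pre_ping and pool_recycle 
arguments of create_engine.

```python
import sqlite3
from collections import deque


class PrePingPool:
    """Tiny illustrative pool that probes a connection before handing it out."""

    def __init__(self, connect, size=1):
        self._connect = connect      # hypothetical factory returning a DB connection
        self._idle = deque(connect() for _ in range(size))
        self.replaced = 0            # number of stale connections discarded

    @staticmethod
    def _is_alive(conn):
        try:
            conn.execute("SELECT 1")  # cheap liveness probe
            return True
        except sqlite3.Error:
            return False

    def checkout(self):
        # Reuse an idle connection if there is one, otherwise open a new one.
        conn = self._idle.popleft() if self._idle else self._connect()
        if not self._is_alive(conn):
            # The connection went stale while idle: swap in a fresh one.
            self.replaced += 1
            conn = self._connect()
        return conn

    def checkin(self, conn):
        self._idle.append(conn)


pool = PrePingPool(lambda: sqlite3.connect(":memory:"))

stale = pool.checkout()
stale.close()                # simulate the server dropping an idle connection
pool.checkin(stale)

fresh = pool.checkout()      # the probe fails, so the pool reconnects
result = fresh.execute("SELECT 1").fetchone()
```

The probe costs one extra round trip per checkout. In exchange, the first 
request after a long idle period gets a working connection instead of failing.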

1) Log Text = {"log":"[2018-09-18 05:33:45,296] {jobs.py:748} ERROR - 
(_mysql_exceptions.OperationalError) (2005, \"Unknown MySQL server host 
'mlp.prod.machine-learning-platform-prod.ms-df-cloudrdbms.prod.walmart.com' 
(2)\") (Background on this error at: [http://sqlalche.me/e/e3q8])

","stream":"stdout","time":"2018-09-18T05:33:45.315547946Z"}

 

2) Log Text = {"log":" raise errorvalue
","stream":"stderr","time":"2018-09-15T06:04:35.722310847Z"}
{"log":"sqlalchemy.exc.OperationalError: (_mysql_exceptions.OperationalError) 
(2013, 'Lost connection to MySQL server during query') [SQL: u'UPDATE job SET 
latest_heartbeat=%s WHERE job.id = %s'] [parameters: (datetime.datetime(2018, 
9, 15, 6, 4, 23, 4294), 345143L)] (Background on this error at: 
[http://sqlalche.me/e/e3q8])
","stream":"stderr","time":"2018-09-15T06:04:35.72232954Z"}
{"log":"[2018-09-15 06:04:35,844: ERROR/ForkPoolWorker-13] Command 'airflow run 
dag_2063_baf60054-d0c7-41b2-8009-4d88f773dc79 web_crawl_pipeline 
2018-09-14T05:48:42 --local -sd 
DAGS_FOLDER/1833_workflows/dag_2063_baf60054-d0c7-41b2-8009-4d88f773dc79.py ' 
returned non-zero exit status 1
","stream":"stderr","time":"2018-09-15T06:04:35.847747612Z"}
{"log":"[2018-09-15 06:04:35,851: ERROR/ForkPoolWorker-13] Task 
airflow.executors.celery_executor.execute_command[30141a5a-71da-4d28-a829-495aeca3cfa9]
 raised unexpected: AirflowException('Celery command failed',)
","stream":"stderr","time":"2018-09-15T06:04:35.855019453Z"}

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
