[ 
https://issues.apache.org/jira/browse/AIRFLOW-3080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16870305#comment-16870305
 ] 

jack commented on AIRFLOW-3080:
-------------------------------

[~amitgh] did you solve this?

> Mysql OperationalError occurs during heartbeat or any DB operation
> ------------------------------------------------------------------
>
>                 Key: AIRFLOW-3080
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-3080
>             Project: Apache Airflow
>          Issue Type: Bug
>          Components: scheduler, worker
>    Affects Versions: 1.10.0
>            Reporter: Amit Ghosh
>            Assignee: Amit Ghosh
>            Priority: Major
>   Original Estimate: 96h
>  Remaining Estimate: 96h
>
> When Airflow uses MySQL with many worker instances and no DAG has run for a
> long time, MySQL raises "_mysql_exceptions.OperationalError".
> The main issue is stale connections: if a given SQLAlchemy pool sends
> nothing to the database for some time, MySQL marks its connections as stale,
> and the first DB request afterwards fails with this error. I am working on
> a fix and will commit it; it should also work for other databases.
> 1) Log Text = {"log":"[2018-09-18 05:33:45,296] {jobs.py:748} ERROR - 
> (_mysql_exceptions.OperationalError) (2005, \"Unknown MySQL server host 
> 'mlp.prod.machine-learning-platform-prod.ms-df-cloudrdbms.prod.walmart.com' 
> (2)\") (Background on this error at: [http://sqlalche.me/e/e3q8])
> ","stream":"stdout","time":"2018-09-18T05:33:45.315547946Z"}
>  
> 2) Log Text = {"log":" raise errorvalue
> ","stream":"stderr","time":"2018-09-15T06:04:35.722310847Z"}
> {"log":"sqlalchemy.exc.OperationalError: (_mysql_exceptions.OperationalError) 
> (2013, 'Lost connection to MySQL server during query') [SQL: u'UPDATE job SET 
> latest_heartbeat=%s WHERE job.id = %s'] [parameters: (datetime.datetime(2018, 
> 9, 15, 6, 4, 23, 4294), 345143L)] (Background on this error at: 
> [http://sqlalche.me/e/e3q8])
> ","stream":"stderr","time":"2018-09-15T06:04:35.72232954Z"}
> {"log":"[2018-09-15 06:04:35,844: ERROR/ForkPoolWorker-13] Command 'airflow 
> run dag_2063_baf60054-d0c7-41b2-8009-4d88f773dc79 web_crawl_pipeline 
> 2018-09-14T05:48:42 --local -sd 
> DAGS_FOLDER/1833_workflows/dag_2063_baf60054-d0c7-41b2-8009-4d88f773dc79.py ' 
> returned non-zero exit status 1
> ","stream":"stderr","time":"2018-09-15T06:04:35.847747612Z"}
> {"log":"[2018-09-15 06:04:35,851: ERROR/ForkPoolWorker-13] Task 
> airflow.executors.celery_executor.execute_command[30141a5a-71da-4d28-a829-495aeca3cfa9]
>  raised unexpected: AirflowException('Celery command failed',)
> ","stream":"stderr","time":"2018-09-15T06:04:35.855019453Z"}
>  
>  
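
For reference, the stale-connection problem described in the quoted report can be illustrated with a "ping before use" check. The sketch below is not the fix proposed in this issue and is not Airflow code. It assumes a hypothetical single-connection pool class (PrePingPool) and uses the standard-library sqlite3 module as a stand-in for MySQL, so the names and the idle threshold are illustrative only. SQLAlchemy itself exposes related engine options (pool_pre_ping and pool_recycle) that may be worth checking before writing anything by hand.

```python
import sqlite3
import time


class PrePingPool:
    """Hypothetical single-connection pool that checks liveness on checkout."""

    def __init__(self, connect, max_idle_seconds=1800.0):
        self._connect = connect            # zero-argument connection factory
        self._max_idle = max_idle_seconds  # assumed idle limit, not a MySQL default
        self._conn = None
        self._last_used = 0.0
        self.reconnects = 0                # counter kept for illustration only

    def _is_alive(self, conn):
        # Cheap round trip; any driver error means the connection is unusable.
        try:
            cur = conn.cursor()
            cur.execute("SELECT 1")
            cur.fetchone()
            return True
        except sqlite3.Error:
            return False

    def checkout(self):
        now = time.monotonic()
        idle_too_long = (
            self._conn is not None and now - self._last_used > self._max_idle
        )
        if self._conn is None or idle_too_long or not self._is_alive(self._conn):
            if self._conn is not None:
                try:
                    self._conn.close()     # discard the stale connection
                except sqlite3.Error:
                    pass
            self._conn = self._connect()   # open a fresh one
            self.reconnects += 1
        self._last_used = now
        return self._conn


pool = PrePingPool(lambda: sqlite3.connect(":memory:"))
first = pool.checkout()
first.close()             # simulate the server dropping an idle connection
second = pool.checkout()  # the ping fails, so a new connection is opened
value = second.execute("SELECT 1").fetchone()[0]
```

The point of the design is that the failure is absorbed at checkout time, so the first real statement after a long idle period (for example the heartbeat UPDATE in the logs above) runs on a connection that has just been verified.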



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
