doowhtron opened a new issue #13668:
URL: https://github.com/apache/airflow/issues/13668


   **Apache Airflow version**: 2.0.0
   
   
   **Kubernetes version (if you are using kubernetes)** (use `kubectl version`):
   
   **Environment**:
   
   - **Cloud provider or hardware configuration**: tencent cloud
   - **OS** (e.g. from /etc/os-release): centos7
   - **Kernel** (e.g. `uname -a`): 3.10
   - **Install tools**:
   - **Others**: mysql8
   
   **What happened**:
   
   The scheduler dies when I try to restart it. The logs are as follows:
   ```
    {2021-01-14 13:29:05,424} {{scheduler_job.py:1293}} ERROR - Exception when executing SchedulerJob._run_scheduler_loop
    Traceback (most recent call last):
      File "/home/app/.pyenv/versions/3.8.1/envs/airflow-py381/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 1276, in _execute_context
        self.dialect.do_execute(
      File "/home/app/.pyenv/versions/3.8.1/envs/airflow-py381/lib/python3.8/site-packages/sqlalchemy/engine/default.py", line 609, in do_execute
        cursor.execute(statement, parameters)
      File "/home/app/.pyenv/versions/3.8.1/envs/airflow-py381/lib/python3.8/site-packages/MySQLdb/cursors.py", line 209, in execute
        res = self._query(query)
      File "/home/app/.pyenv/versions/3.8.1/envs/airflow-py381/lib/python3.8/site-packages/MySQLdb/cursors.py", line 315, in _query
        db.query(q)
      File "/home/app/.pyenv/versions/3.8.1/envs/airflow-py381/lib/python3.8/site-packages/MySQLdb/connections.py", line 239, in query
        _mysql.connection.query(self, query)
    MySQLdb._exceptions.OperationalError: (1213, 'Deadlock found when trying to get lock; try restarting transaction')
    
    The above exception was the direct cause of the following exception:
    
    Traceback (most recent call last):
      File "/home/app/.pyenv/versions/3.8.1/envs/airflow-py381/lib/python3.8/site-packages/airflow/jobs/scheduler_job.py", line 1275, in _execute
        self._run_scheduler_loop()
      File "/home/app/.pyenv/versions/3.8.1/envs/airflow-py381/lib/python3.8/site-packages/airflow/jobs/scheduler_job.py", line 1349, in _run_scheduler_loop
        self.adopt_or_reset_orphaned_tasks()
      File "/home/app/.pyenv/versions/3.8.1/envs/airflow-py381/lib/python3.8/site-packages/airflow/utils/session.py", line 65, in wrapper
        return func(*args, session=session, **kwargs)
      File "/home/app/.pyenv/versions/3.8.1/envs/airflow-py381/lib/python3.8/site-packages/airflow/jobs/scheduler_job.py", line 1758, in adopt_or_reset_orphaned_tasks
        session.query(SchedulerJob)
      File "/home/app/.pyenv/versions/3.8.1/envs/airflow-py381/lib/python3.8/site-packages/sqlalchemy/orm/query.py", line 4063, in update
        update_op.exec_()
      File "/home/app/.pyenv/versions/3.8.1/envs/airflow-py381/lib/python3.8/site-packages/sqlalchemy/orm/persistence.py", line 1697, in exec_
        self._do_exec()
      File "/home/app/.pyenv/versions/3.8.1/envs/airflow-py381/lib/python3.8/site-packages/sqlalchemy/orm/persistence.py", line 1895, in _do_exec
        self._execute_stmt(update_stmt)
      File "/home/app/.pyenv/versions/3.8.1/envs/airflow-py381/lib/python3.8/site-packages/sqlalchemy/orm/persistence.py", line 1702, in _execute_stmt
        self.result = self.query._execute_crud(stmt, self.mapper)
      File "/home/app/.pyenv/versions/3.8.1/envs/airflow-py381/lib/python3.8/site-packages/sqlalchemy/orm/query.py", line 3568, in _execute_crud
        return conn.execute(stmt, self._params)
      File "/home/app/.pyenv/versions/3.8.1/envs/airflow-py381/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 1011, in execute
        return meth(self, multiparams, params)
      File "/home/app/.pyenv/versions/3.8.1/envs/airflow-py381/lib/python3.8/site-packages/sqlalchemy/sql/elements.py", line 298, in _execute_on_connection
        return connection._execute_clauseelement(self, multiparams, params)
      File "/home/app/.pyenv/versions/3.8.1/envs/airflow-py381/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 1124, in _execute_clauseelement
        ret = self._execute_context(
      File "/home/app/.pyenv/versions/3.8.1/envs/airflow-py381/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 1316, in _execute_context
        self._handle_dbapi_exception(
      File "/home/app/.pyenv/versions/3.8.1/envs/airflow-py381/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 1510, in _handle_dbapi_exception
        util.raise_(
      File "/home/app/.pyenv/versions/3.8.1/envs/airflow-py381/lib/python3.8/site-packages/sqlalchemy/util/compat.py", line 182, in raise_
        raise exception
      File "/home/app/.pyenv/versions/3.8.1/envs/airflow-py381/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 1276, in _execute_context
        self.dialect.do_execute(
      File "/home/app/.pyenv/versions/3.8.1/envs/airflow-py381/lib/python3.8/site-packages/sqlalchemy/engine/default.py", line 609, in do_execute
        cursor.execute(statement, parameters)
      File "/home/app/.pyenv/versions/3.8.1/envs/airflow-py381/lib/python3.8/site-packages/MySQLdb/cursors.py", line 209, in execute
        res = self._query(query)
      File "/home/app/.pyenv/versions/3.8.1/envs/airflow-py381/lib/python3.8/site-packages/MySQLdb/cursors.py", line 315, in _query
        db.query(q)
      File "/home/app/.pyenv/versions/3.8.1/envs/airflow-py381/lib/python3.8/site-packages/MySQLdb/connections.py", line 239, in query
        _mysql.connection.query(self, query)
    sqlalchemy.exc.OperationalError: (MySQLdb._exceptions.OperationalError) (1213, 'Deadlock found when trying to get lock; try restarting transaction')
    [SQL: UPDATE job SET state=%s WHERE job.state = %s AND job.latest_heartbeat < %s]
    [parameters: ('failed', 'running', datetime.datetime(2021, 1, 14, 5, 28, 35, 157941))]
    (Background on this error at: http://sqlalche.me/e/13/e3q8)
    {2021-01-14 13:29:06,435} {{process_utils.py:95}} INFO - Sending Signals.SIGTERM to GPID 6293
    {2021-01-14 13:29:06,677} {{process_utils.py:61}} INFO - Process psutil.Process(pid=6318, status='terminated') (6318) terminated with exit code None
    {2021-01-14 13:29:06,767} {{process_utils.py:201}} INFO - Waiting up to 5 seconds for processes to exit...
    {2021-01-14 13:29:06,850} {{process_utils.py:61}} INFO - Process psutil.Process(pid=6320, status='terminated') (6320) terminated with exit code None
    {2021-01-14 13:29:06,850} {{process_utils.py:61}} INFO - Process psutil.Process(pid=6319, status='terminated') (6319) terminated with exit code None
    {2021-01-14 13:29:06,858} {{process_utils.py:61}} INFO - Process psutil.Process(pid=6321, status='terminated') (6321) terminated with exit code None
    {2021-01-14 13:29:06,864} {{process_utils.py:201}} INFO - Waiting up to 5 seconds for processes to exit...
    {2021-01-14 13:29:06,876} {{process_utils.py:61}} INFO - Process psutil.Process(pid=6293, status='terminated') (6293) terminated with exit code 0
    {2021-01-14 13:29:06,876} {{scheduler_job.py:1296}} INFO - Exited execute loop
   ```
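
   The traceback ends in MySQL error 1213: InnoDB rolled this transaction back as the deadlock victim and expects the client to retry ("try restarting transaction"). A generic mitigation is to wrap the failing statement in a retry loop. A minimal sketch of that pattern follows; the `retry_on_deadlock` helper, its defaults, and the error-code check are my own illustration, not Airflow's implementation:

   ```python
   import functools
   import time


   def retry_on_deadlock(max_retries=3, delay=0.5, is_deadlock=None):
       """Retry a callable when it raises a MySQL deadlock error (code 1213).

       `is_deadlock` decides whether an exception is retryable; by default it
       looks for the 1213 error code as the first exception argument, which is
       where MySQLdb puts it.
       """
       if is_deadlock is None:
           def is_deadlock(exc):
               return getattr(exc, "args", (None,))[0] == 1213

       def decorator(func):
           @functools.wraps(func)
           def wrapper(*args, **kwargs):
               for attempt in range(max_retries):
                   try:
                       return func(*args, **kwargs)
                   except Exception as exc:
                       # Re-raise immediately for non-deadlocks, or once the
                       # retry budget is exhausted.
                       if not is_deadlock(exc) or attempt == max_retries - 1:
                           raise
                       time.sleep(delay)
           return wrapper
       return decorator
   ```

   Deadlocks are transient by nature, so retrying the whole transaction (not just the last statement) is the usual fix; reducing scheduler concurrency against the same tables also lowers the collision rate.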
   
   **What you expected to happen**:
   
   The scheduler should not die.
   
   **How to reproduce it**:
   
   I don't know how to reproduce it.
   
   **Anything else we need to know**:
   
   I just upgraded Airflow from 1.10.14. For now I am trying to work around it by catching the exception in scheduler_job.py:
   
   ```python
    for dag_run in dag_runs:
        try:
            self._schedule_dag_run(dag_run, active_runs_by_dag_id.get(dag_run.dag_id, set()), session)
        except Exception as e:
            self.log.exception(e)
   ```
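
   One caveat with that workaround: the bare `except Exception` also swallows real scheduling bugs, not just deadlocks. A narrower sketch that tolerates only error 1213 and re-raises everything else is below; `schedule_dag_runs` is a hypothetical free-function restatement of the loop (it assumes a `scheduler` object exposing `_schedule_dag_run` and `log` like SchedulerJob does), not a proposed patch:

   ```python
   def schedule_dag_runs(scheduler, dag_runs, active_runs_by_dag_id, session):
       """Schedule each dag_run, tolerating only MySQL deadlocks (code 1213)."""
       for dag_run in dag_runs:
           try:
               scheduler._schedule_dag_run(
                   dag_run, active_runs_by_dag_id.get(dag_run.dag_id, set()), session
               )
           except Exception as exc:
               # SQLAlchemy wraps the DBAPI error ("direct cause" in the
               # traceback above), so check the root cause's error code.
               root = exc.__cause__ or exc
               if getattr(root, "args", (None,))[0] != 1213:
                   raise  # not a deadlock: let it propagate as a real failure
               scheduler.log.exception(exc)
   ```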
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]