shivanshs9 commented on issue #11899:
URL: https://github.com/apache/airflow/issues/11899#issuecomment-735769434


   @ashb Ah, sorry for the delayed response. The issue is still occurring, 
unfortunately.
   <details>
   <summary>Scheduler logs</summary>
   
   ```
   [2020-11-30 12:11:01,752] {{scheduler_job.py:1301}} ERROR - Exception when 
executing SchedulerJob._run_scheduler_loop
   Traceback (most recent call last):
     File 
"/home/airflow/.local/lib/python3.8/site-packages/sqlalchemy/engine/base.py", 
line 1277, in _execute_context
       self.dialect.do_execute(
     File 
"/home/airflow/.local/lib/python3.8/site-packages/sqlalchemy/engine/default.py",
 line 593, in do_execute
       cursor.execute(statement, parameters)
     File 
"/home/airflow/.local/lib/python3.8/site-packages/MySQLdb/cursors.py", line 
255, in execute
       self.errorhandler(self, exc, value)
     File 
"/home/airflow/.local/lib/python3.8/site-packages/MySQLdb/connections.py", line 
50, in defaulterrorhandler
       raise errorvalue
     File 
"/home/airflow/.local/lib/python3.8/site-packages/MySQLdb/cursors.py", line 
252, in execute
       res = self._query(query)
     File 
"/home/airflow/.local/lib/python3.8/site-packages/MySQLdb/cursors.py", line 
378, in _query
       db.query(q)
     File 
"/home/airflow/.local/lib/python3.8/site-packages/MySQLdb/connections.py", line 
280, in query
       _mysql.connection.query(self, query)
   _mysql_exceptions.OperationalError: (1213, 'Deadlock found when trying to 
get lock; try restarting transaction')
   
   The above exception was the direct cause of the following exception:
   
   Traceback (most recent call last):
     File 
"/home/airflow/.local/lib/python3.8/site-packages/airflow/jobs/scheduler_job.py",
 line 1283, in _execute
       self._run_scheduler_loop()
     File 
"/home/airflow/.local/lib/python3.8/site-packages/airflow/jobs/scheduler_job.py",
 line 1385, in _run_scheduler_loop
       num_queued_tis = self._do_scheduling(session)
     File 
"/home/airflow/.local/lib/python3.8/site-packages/airflow/jobs/scheduler_job.py",
 line 1543, in _do_scheduling
       num_queued_tis = 
self._critical_section_execute_task_instances(session=session)
     File 
"/home/airflow/.local/lib/python3.8/site-packages/airflow/jobs/scheduler_job.py",
 line 1140, in _critical_section_execute_task_instances
       queued_tis = self._executable_task_instances_to_queued(max_tis, 
session=session)
     File 
"/home/airflow/.local/lib/python3.8/site-packages/airflow/utils/session.py", 
line 59, in wrapper
       return func(*args, **kwargs)
     File 
"/home/airflow/.local/lib/python3.8/site-packages/airflow/jobs/scheduler_job.py",
 line 932, in _executable_task_instances_to_queued
       task_instances_to_examine: List[TI] = with_row_locks(
     File 
"/home/airflow/.local/lib/python3.8/site-packages/sqlalchemy/orm/query.py", 
line 3341, in all
       return list(self)
     File 
"/home/airflow/.local/lib/python3.8/site-packages/sqlalchemy/orm/query.py", 
line 3503, in __iter__
       return self._execute_and_instances(context)
     File 
"/home/airflow/.local/lib/python3.8/site-packages/sqlalchemy/orm/query.py", 
line 3528, in _execute_and_instances
       result = conn.execute(querycontext.statement, self._params)
     File 
"/home/airflow/.local/lib/python3.8/site-packages/sqlalchemy/engine/base.py", 
line 1014, in execute
       return meth(self, multiparams, params)
     File 
"/home/airflow/.local/lib/python3.8/site-packages/sqlalchemy/sql/elements.py", 
line 298, in _execute_on_connection
       return connection._execute_clauseelement(self, multiparams, params)
     File 
"/home/airflow/.local/lib/python3.8/site-packages/sqlalchemy/engine/base.py", 
line 1127, in _execute_clauseelement
       ret = self._execute_context(
     File 
"/home/airflow/.local/lib/python3.8/site-packages/sqlalchemy/engine/base.py", 
line 1317, in _execute_context
       self._handle_dbapi_exception(
     File 
"/home/airflow/.local/lib/python3.8/site-packages/sqlalchemy/engine/base.py", 
line 1511, in _handle_dbapi_exception
       util.raise_(
     File 
"/home/airflow/.local/lib/python3.8/site-packages/sqlalchemy/util/compat.py", 
line 178, in raise_
       raise exception
     File 
"/home/airflow/.local/lib/python3.8/site-packages/sqlalchemy/engine/base.py", 
line 1277, in _execute_context
       self.dialect.do_execute(
     File 
"/home/airflow/.local/lib/python3.8/site-packages/sqlalchemy/engine/default.py",
 line 593, in do_execute
       cursor.execute(statement, parameters)
     File 
"/home/airflow/.local/lib/python3.8/site-packages/MySQLdb/cursors.py", line 
255, in execute
       self.errorhandler(self, exc, value)
     File 
"/home/airflow/.local/lib/python3.8/site-packages/MySQLdb/connections.py", line 
50, in defaulterrorhandler
       raise errorvalue
     File 
"/home/airflow/.local/lib/python3.8/site-packages/MySQLdb/cursors.py", line 
252, in execute
       res = self._query(query)
     File 
"/home/airflow/.local/lib/python3.8/site-packages/MySQLdb/cursors.py", line 
378, in _query
       db.query(q)
     File 
"/home/airflow/.local/lib/python3.8/site-packages/MySQLdb/connections.py", line 
280, in query
       _mysql.connection.query(self, query)
   sqlalchemy.exc.OperationalError: (_mysql_exceptions.OperationalError) (1213, 
'Deadlock found when trying to get lock; try restarting transaction')
   [SQL: SELECT task_instance.try_number AS task_instance_try_number, 
task_instance.task_id AS task_instance_task_id, task_instance.dag_id AS 
task_instance_dag_id, task_instance.execution_date AS 
task_instance_execution_date, task_instance.start_date AS 
task_instance_start_date, task_instance.end_date AS task_instance_end_date, 
task_instance.duration AS task_instance_duration, task_instance.state AS 
task_instance_state, task_instance.max_tries AS task_instance_max_tries, 
task_instance.hostname AS task_instance_hostname, task_instance.unixname AS 
task_instance_unixname, task_instance.job_id AS task_instance_job_id, 
task_instance.pool AS task_instance_pool, task_instance.pool_slots AS 
task_instance_pool_slots, task_instance.queue AS task_instance_queue, 
task_instance.priority_weight AS task_instance_priority_weight, 
task_instance.operator AS task_instance_operator, task_instance.queued_dttm AS 
task_instance_queued_dttm, task_instance.queued_by_job_id AS 
task_instance_queued_by_job_id, 
 task_instance.pid AS task_instance_pid, task_instance.executor_config AS 
task_instance_executor_config, task_instance.external_executor_id AS 
task_instance_external_executor_id
   FROM task_instance LEFT OUTER JOIN dag_run ON task_instance.dag_id = 
dag_run.dag_id AND task_instance.execution_date = dag_run.execution_date INNER 
JOIN dag ON task_instance.dag_id = dag.dag_id
   WHERE (dag_run.run_id IS NULL OR dag_run.run_type != %s) AND dag.is_paused = 
0 AND task_instance.state = %s
    LIMIT %s FOR UPDATE]
   [parameters: (<DagRunType.BACKFILL_JOB: 'backfill'>, 'scheduled', 29)]
   (Background on this error at: http://sqlalche.me/e/13/e3q8)
   [2020-11-30 12:11:02,774] {{process_utils.py:95}} INFO - Sending 
Signals.SIGTERM to GPID 50
   [2020-11-30 12:11:12,964] {{process_utils.py:198}} INFO - Terminating child 
PID: 335
   [2020-11-30 12:11:12,964] {{process_utils.py:198}} INFO - Terminating child 
PID: 336
   [2020-11-30 12:11:12,964] {{process_utils.py:201}} INFO - Waiting up to 5 
seconds for processes to exit...
   [2020-11-30 12:11:17,974] {{process_utils.py:214}} INFO - SIGKILL processes 
that did not terminate gracefully
   [2020-11-30 12:11:17,975] {{process_utils.py:216}} INFO - Killing child PID: 
335
   [2020-11-30 12:11:17,979] {{process_utils.py:216}} INFO - Killing child PID: 
336
   [2020-11-30 12:11:18,015] {{process_utils.py:61}} INFO - Process 
psutil.Process(pid=335, status='terminated', started='12:10:59') (335) 
terminated with exit code None
   [2020-11-30 12:11:18,440] {{process_utils.py:61}} INFO - Process 
psutil.Process(pid=336, status='terminated', started='12:11:00') (336) 
terminated with exit code None
   [2020-11-30 12:12:02,785] {{process_utils.py:108}} WARNING - process 
psutil.Process(pid=334, name='airflow schedul', status='sleeping', 
started='12:10:59') did not respond to SIGTERM. Trying SIGKILL
   [2020-11-30 12:12:02,786] {{process_utils.py:108}} WARNING - process 
psutil.Process(pid=50, name='airflow scheduler -- DagFileProcessorManager', 
status='sleeping', started='12:09:58') did not respond to SIGTERM. Trying 
SIGKILL
   [2020-11-30 12:12:02,787] {{process_utils.py:108}} WARNING - process 
psutil.Process(pid=331, name='airflow schedul', status='sleeping', 
started='12:10:58') did not respond to SIGTERM. Trying SIGKILL
   [2020-11-30 12:12:02,801] {{process_utils.py:61}} INFO - Process 
psutil.Process(pid=334, name='airflow schedul', status='terminated', 
started='12:10:59') (334) terminated with exit code None
   [2020-11-30 12:12:02,801] {{process_utils.py:61}} INFO - Process 
psutil.Process(pid=50, name='airflow scheduler -- DagFileProcessorManager', 
status='terminated', exitcode=<Negsignal.SIGKILL: -9>, started='12:09:58') (50) 
terminated with exit code Negsignal.SIGKILL
   [2020-11-30 12:12:02,802] {{process_utils.py:61}} INFO - Process 
psutil.Process(pid=331, name='airflow schedul', status='terminated', 
started='12:10:58') (331) terminated with exit code None
   [2020-11-30 12:12:02,802] {{scheduler_job.py:1304}} INFO - Exited execute 
loop
   ```
   </details>
   
   Airflow version:
   ```
   airflow@ergo-chronos-scheduler-695d46c8d6-qgnvv:/opt/airflow$ airflow version
   [2020-11-30 12:52:49,519] {{plugins_manager.py:283}} INFO - Loading 2 
plugin(s) took 0.86 seconds
   2.0.0b3
   ```
   
   Oddly, the scheduler process appears to be terminated (as the logs show), but 
that does not crash the enclosing pod. As a result, the container is never 
restarted, and the scheduler stays down indefinitely.
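
As a side note on the error itself: the MySQL message for error 1213 in the traceback above says to restart the transaction. A minimal sketch of that retry pattern follows. It is not what Airflow's scheduler does, and `FakeOperationalError`, `is_deadlock`, and `run_with_deadlock_retry` are hypothetical names used only for illustration. A real driver exception would replace the stand-in class.

```python
import time

MYSQL_DEADLOCK_ERRNO = 1213  # error code shown in the traceback above


class FakeOperationalError(Exception):
    """Hypothetical stand-in for a DB driver error; args[0] holds the error code."""


def is_deadlock(exc):
    """Return True if the exception carries the MySQL deadlock error code."""
    return bool(exc.args) and exc.args[0] == MYSQL_DEADLOCK_ERRNO


def run_with_deadlock_retry(txn, retries=3, delay=0.0):
    """Call txn(); on a deadlock error, retry up to `retries` more times.

    Any other error, or a deadlock on the final attempt, is re-raised.
    """
    for attempt in range(retries + 1):
        try:
            return txn()
        except FakeOperationalError as exc:
            if not is_deadlock(exc) or attempt == retries:
                raise
            time.sleep(delay)  # optional pause before restarting the transaction
```

Here `txn` is assumed to be a callable that runs one complete transaction, so that a retry starts from a clean state rather than resuming a half-finished one.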


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]

