yuqian90 commented on issue #7935: URL: https://github.com/apache/airflow/issues/7935#issuecomment-843159653
The same behaviour as in my [previous comment](https://github.com/apache/airflow/issues/7935#issuecomment-839656436) happened again, so I took a `py-spy dump` of both the main `airflow scheduler` and the child process.

When the scheduler got stuck, the main `airflow scheduler` process was blocked in `celery_executor.py::_send_tasks_to_celery`, in the `__exit__` of `multiprocessing.Pool`. The code suggests that the `_terminate_pool()` method does send a `SIGTERM`. That seems to explain why there is an `Exiting gracefully upon receiving signal 15` in the scheduler log, although it's not clear why the `SIGTERM` is sent to the main scheduler process itself.

The child `airflow scheduler` process was blocked in `_send_tasks_to_celery` while trying to acquire the lock of a `SimpleQueue`.

This is the `py-spy dump` of the main `airflow scheduler` process when it got stuck:

```
Python v3.8.7

Thread 0x7FB54794E740 (active): "MainThread"
    poll (multiprocessing/popen_fork.py:27)
    wait (multiprocessing/popen_fork.py:47)
    join (multiprocessing/process.py:149)
    _terminate_pool (multiprocessing/pool.py:729)
    __call__ (multiprocessing/util.py:224)
    terminate (multiprocessing/pool.py:654)
    __exit__ (multiprocessing/pool.py:736)
    _send_tasks_to_celery (airflow/executors/celery_executor.py:331)
    _process_tasks (airflow/executors/celery_executor.py:272)
    trigger_tasks (airflow/executors/celery_executor.py:263)
    heartbeat (airflow/executors/base_executor.py:158)
    _run_scheduler_loop (airflow/jobs/scheduler_job.py:1388)
    _execute (airflow/jobs/scheduler_job.py:1284)
    run (airflow/jobs/base_job.py:237)
    scheduler (airflow/cli/commands/scheduler_command.py:63)
    wrapper (airflow/utils/cli.py:89)
    command (airflow/cli/cli_parser.py:48)
    main (airflow/__main__.py:40)
    <module> (airflow:8)
```

This is the `py-spy dump` of the child `airflow scheduler` process when it got stuck:

```
Python v3.8.7

Thread 16232 (idle): "MainThread"
    __enter__ (multiprocessing/synchronize.py:95)
    get (multiprocessing/queues.py:355)
    worker (multiprocessing/pool.py:114)
    run (multiprocessing/process.py:108)
    _bootstrap (multiprocessing/process.py:315)
    _launch (multiprocessing/popen_fork.py:75)
    __init__ (multiprocessing/popen_fork.py:19)
    _Popen (multiprocessing/context.py:277)
    start (multiprocessing/process.py:121)
    _repopulate_pool_static (multiprocessing/pool.py:326)
    _repopulate_pool (multiprocessing/pool.py:303)
    __init__ (multiprocessing/pool.py:212)
    Pool (multiprocessing/context.py:119)
    _send_tasks_to_celery (airflow/executors/celery_executor.py:330)
    _process_tasks (airflow/executors/celery_executor.py:272)
    trigger_tasks (airflow/executors/celery_executor.py:263)
    heartbeat (airflow/executors/base_executor.py:158)
    _run_scheduler_loop (airflow/jobs/scheduler_job.py:1388)
    _execute (airflow/jobs/scheduler_job.py:1284)
    run (airflow/jobs/base_job.py:237)
    scheduler (airflow/cli/commands/scheduler_command.py:63)
    wrapper (airflow/utils/cli.py:89)
    command (airflow/cli/cli_parser.py:48)
    main (airflow/__main__.py:40)
    <module> (airflow:8)
```

-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]
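For readers following the two dumps, here is a minimal sketch of the general pattern they point at: a `multiprocessing.Pool` used as a context manager, where leaving the `with` block runs `Pool.__exit__` and therefore `terminate()`. This is not the Airflow code. `fake_send` and `send_all` are hypothetical names, and the `fork` start method is an assumption taken from the `popen_fork.py` frames.

```python
import multiprocessing


def fake_send(task):
    """Hypothetical stand-in for sending one task to Celery."""
    return task * 2


def send_all(tasks, num_processes=2):
    # Assumption: a POSIX system. "fork" matches the popen_fork.py frames
    # in the dumps and is not available on Windows.
    ctx = multiprocessing.get_context("fork")
    with ctx.Pool(processes=num_processes) as pool:
        results = pool.map(fake_send, tasks)
    # Leaving the `with` block runs Pool.__exit__, which calls
    # pool.terminate(); the parent dump shows the scheduler blocked there.
    return results, pool


if __name__ == "__main__":
    results, pool = send_all([1, 2, 3])
    print(results)
    try:
        # A terminated pool rejects new work.
        pool.apply(fake_send, (1,))
    except ValueError as exc:
        print("pool is no longer usable:", exc)
```

In the normal case the `with` block returns promptly. The dumps above show the case where it does not: the parent waits in `join` while a worker is still waiting on the queue lock.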
