yuqian90 commented on issue #7935:
URL: https://github.com/apache/airflow/issues/7935#issuecomment-843159653


   The same behaviour as in my [previous 
comment](https://github.com/apache/airflow/issues/7935#issuecomment-839656436) 
happened again, so I took a `py-spy dump` of both the main `airflow scheduler` 
process and its child process. When the scheduler hung, the main `airflow 
scheduler` process was stuck in `celery_executor.py::_send_tasks_to_celery`, in 
the `__exit__` of `multiprocessing.Pool`. The code suggests that the 
`_terminate_pool()` method does send a `SIGTERM`. That seems to explain why 
there's an `Exiting gracefully upon receiving signal 15` message in the 
scheduler log, although it's not clear why the `SIGTERM` is sent to the main 
scheduler process itself. 
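
One possible reading of that log line, which these dumps do not confirm: with the `fork` start method, a child process inherits any Python-level signal handler the parent installed, so a handler written for the main process can also run inside a forked child when that child receives `SIGTERM`. The sketch below only demonstrates the inheritance itself. It is not Airflow code; `parent_sigterm_handler`, `child_body`, `run_demo` and the exit code `42` are hypothetical names and values chosen for illustration.

```python
import multiprocessing
import os
import signal
import time

HANDLER_EXIT_CODE = 42  # arbitrary marker value, chosen only for this sketch


def parent_sigterm_handler(signum, frame):
    # Hypothetical stand-in for a "exit gracefully" handler installed by the
    # parent. os._exit() makes it observable which process ran the handler.
    os._exit(HANDLER_EXIT_CODE)


def child_body(ready):
    ready.set()      # tell the parent that the child is up and running
    time.sleep(30)   # pretend to be blocked waiting for work


def run_demo():
    # Use fork explicitly, matching the popen_fork.py frames in the dumps.
    ctx = multiprocessing.get_context("fork")
    previous = signal.signal(signal.SIGTERM, parent_sigterm_handler)
    try:
        ready = ctx.Event()
        child = ctx.Process(target=child_body, args=(ready,))
        child.start()
        ready.wait(timeout=10)   # avoid signalling before the child is ready
        child.terminate()        # sends SIGTERM to the child on POSIX
        child.join(timeout=10)
        return child.exitcode
    finally:
        # Restore whatever SIGTERM handler the parent had before the demo.
        signal.signal(signal.SIGTERM, previous)


if __name__ == "__main__":
    print("child exit code:", run_demo())
```

If the child had not inherited the handler, its exit code would be `-15` (killed by `SIGTERM`) rather than the marker value, so the exit code shows that the parent's handler ran in the child.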
   
   The child `airflow scheduler` process was stuck in `_send_tasks_to_celery` 
while trying to acquire the lock of the `SimpleQueue`.
   
   
   This is the `py-spy dump` of the main `airflow scheduler` process when it 
got stuck:
   ```
   Python v3.8.7
   
   Thread 0x7FB54794E740 (active): "MainThread"
       poll (multiprocessing/popen_fork.py:27)
       wait (multiprocessing/popen_fork.py:47)
       join (multiprocessing/process.py:149)
       _terminate_pool (multiprocessing/pool.py:729)
       __call__ (multiprocessing/util.py:224)
       terminate (multiprocessing/pool.py:654)
       __exit__ (multiprocessing/pool.py:736)
       _send_tasks_to_celery (airflow/executors/celery_executor.py:331)
       _process_tasks (airflow/executors/celery_executor.py:272)
       trigger_tasks (airflow/executors/celery_executor.py:263)
       heartbeat (airflow/executors/base_executor.py:158)
       _run_scheduler_loop (airflow/jobs/scheduler_job.py:1388)
       _execute (airflow/jobs/scheduler_job.py:1284)
       run (airflow/jobs/base_job.py:237)
       scheduler (airflow/cli/commands/scheduler_command.py:63)
       wrapper (airflow/utils/cli.py:89)
       command (airflow/cli/cli_parser.py:48)
       main (airflow/__main__.py:40)
       <module> (airflow:8)
   ```
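
The `__exit__` → `terminate` → `_terminate_pool` → `join` frames above can be reproduced in isolation. This is a minimal sketch, assuming the `fork` start method (as in the dump) and a trivial task function; `square` and `run_pool_and_inspect` are hypothetical names, and `pool._pool` is a private attribute of `multiprocessing.Pool` that is read here only for inspection.

```python
import multiprocessing
import signal


def square(x):
    return x * x


def run_pool_and_inspect():
    ctx = multiprocessing.get_context("fork")
    with ctx.Pool(processes=2) as pool:
        results = pool.map(square, [1, 2, 3])
        # Private attribute: the list of worker Process objects.
        workers = list(pool._pool)
    # Leaving the with-block calls pool.terminate(), which signals workers
    # that are still running and then joins them before returning.
    states = [(w.is_alive(), w.exitcode) for w in workers]
    return results, states


if __name__ == "__main__":
    results, states = run_pool_and_inspect()
    print(results)
    # Each exit code is either 0 (the worker exited on its own) or
    # -signal.SIGTERM (the worker was killed); which one depends on timing.
    print(states)
```

Leaving the `with` block calls `terminate()` rather than `close()` followed by `join()`, so the parent does not return from `__exit__` until every worker has exited. If a worker never exits, the parent stays blocked in `join`, which is where the main process sits in the dump above.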
   
   This is the `py-spy dump` of the child `airflow scheduler` process when it 
got stuck:
   ```
   Python v3.8.7
   
   Thread 16232 (idle): "MainThread"
       __enter__ (multiprocessing/synchronize.py:95)
       get (multiprocessing/queues.py:355)
       worker (multiprocessing/pool.py:114)
       run (multiprocessing/process.py:108)
       _bootstrap (multiprocessing/process.py:315)
       _launch (multiprocessing/popen_fork.py:75)
       __init__ (multiprocessing/popen_fork.py:19)
       _Popen (multiprocessing/context.py:277)
       start (multiprocessing/process.py:121)
       _repopulate_pool_static (multiprocessing/pool.py:326)
       _repopulate_pool (multiprocessing/pool.py:303)
       __init__ (multiprocessing/pool.py:212)
       Pool (multiprocessing/context.py:119)
       _send_tasks_to_celery (airflow/executors/celery_executor.py:330)
       _process_tasks (airflow/executors/celery_executor.py:272)
       trigger_tasks (airflow/executors/celery_executor.py:263)
       heartbeat (airflow/executors/base_executor.py:158)
       _run_scheduler_loop (airflow/jobs/scheduler_job.py:1388)
       _execute (airflow/jobs/scheduler_job.py:1284)
       run (airflow/jobs/base_job.py:237)
       scheduler (airflow/cli/commands/scheduler_command.py:63)
       wrapper (airflow/utils/cli.py:89)
       command (airflow/cli/cli_parser.py:48)
       main (airflow/__main__.py:40)
       <module> (airflow:8)
   ```
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]

