milton0825 commented on issue #7935:
URL: https://github.com/apache/airflow/issues/7935#issuecomment-784638031


   I have a theory about why the Airflow scheduler may get stuck at 
[CeleryExecutor._send_tasks_to_celery](https://github.com/apache/airflow/blob/master/airflow/executors/celery_executor.py#L331-L333).
   
   The return value from `send_task_to_executor` can be huge because it includes the 
traceback when a task fails, and there appears to be a known CPython bug [1] 
in which very large output can deadlock `multiprocessing.Pool`.
   
   For example, the following code easily deadlocks on Python 3.6.3:
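   ```
   import multiprocessing

   def f(x):
       # Each call returns a 1,000,000-character string, so results are large.
       return ' ' * 1000000

   if __name__ == '__main__':
       with multiprocessing.Pool(1) as p:
           # Iterating the 200,000-character string yields one task per character.
           r = p.map(f, ('hi'*100000))
   ```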
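
If the oversized return value is indeed the trigger, one possible mitigation is to cap what each worker sends back through the pool. The sketch below is only an illustration of that idea, not Airflow's code: `run_task` and `MAX_TRACEBACK_CHARS` are hypothetical names, and the cap of 2000 characters is an arbitrary assumption.

```python
import multiprocessing
import traceback

# Hypothetical cap on the traceback size; not an Airflow setting.
MAX_TRACEBACK_CHARS = 2000


def run_task(x):
    """Hypothetical worker: returns (ok, payload) with a bounded payload."""
    try:
        if x < 0:
            raise ValueError("negative input: %r" % (x,))
        return True, x * 2
    except Exception:
        # Keep only the tail of the traceback, where the raising frame is.
        return False, traceback.format_exc()[-MAX_TRACEBACK_CHARS:]


if __name__ == "__main__":
    with multiprocessing.Pool(1) as pool:
        for ok, payload in pool.map(run_task, [1, -1]):
            print(ok, len(str(payload)))
```

Truncating from the end keeps the frame that raised, which is usually the most useful part when the full traceback is too large to pass back safely.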
   
   [1] https://bugs.python.org/issue35267



