milton0825 edited a comment on issue #7935: URL: https://github.com/apache/airflow/issues/7935#issuecomment-784638031
I have a theory about why the Airflow scheduler may get stuck at [CeleryExecutor._send_tasks_to_celery](https://github.com/apache/airflow/blob/master/airflow/executors/celery_executor.py#L331-L333) (our scheduler got stuck in a different place 😃). The return value of `send_task_to_executor` can be huge because the traceback is included in case of failure, and it looks like there is a known bug [1] in CPython where huge output can cause a deadlock in `multiprocessing.Pool`.

For example, the following code easily deadlocks on Python 3.6.3:

```
import multiprocessing


def f(x):
    # Each call returns about 1 MB, so the pool has to ship large results back.
    return ' ' * 1000000


if __name__ == '__main__':
    with multiprocessing.Pool(1) as p:
        # The iterable is a 200,000-character string, so f runs once per character.
        r = p.map(f, ('hi'*100000))
```

[1] https://bugs.python.org/issue35267
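To make the first half of the theory concrete, here is a rough sketch of how a returned traceback can inflate the payload a pool worker has to send back to the parent. `send_task_stub` and `pickled_size` are hypothetical names made up for illustration; this is not Airflow's actual `send_task_to_executor`, only a stand-in that returns the formatted traceback on failure, as described above.

```python
import pickle
import traceback


def send_task_stub(task_id, payload):
    """Hypothetical stand-in for a worker function that returns its traceback on failure."""
    try:
        # Simulate a failure whose message embeds the (possibly large) payload.
        raise RuntimeError("send failed for: " + payload)
    except Exception:
        # The formatted traceback contains the exception message, payload included.
        return task_id, traceback.format_exc()


def pickled_size(obj):
    """Number of bytes the object occupies once pickled, as it would be for a Pool result."""
    return len(pickle.dumps(obj))


small = pickled_size(send_task_stub("t1", "x"))
large = pickled_size(send_task_stub("t1", "x" * 1000000))
print("small payload:", small, "bytes")
print("large payload:", large, "bytes")
```

With a large exception message, the pickled result grows by roughly the size of that message, which is the kind of oversized return value that might trigger the `multiprocessing.Pool` issue in [1].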
