That’s great! Thanks for your reply!

From: Kamil Breguła <[email protected]>
Sent: Thursday, December 5, 2019 4:19 PM
To: [email protected]
Subject: Re: Celery Task Startup Overhead
Hello,

This is caused by strict process isolation. Each task is started in a new process, where the Python interpreter is loaded completely anew. This change can help solve some of your problems: https://github.com/apache/airflow/pull/6627

Best regards,
Kamil

On Thu, Dec 5, 2019 at 9:41 PM Aaron Grubb <[email protected]> wrote:

Hi everyone,

I’ve been testing Celery workers with both prefork and eventlet pools, and I'm noticing massive startup overhead for simple BashOperators. For example, 20 instances of:

    BashOperator(
        task_id='test0',
        bash_command="echo 'test'",
        dag=dag)

executed concurrently spike my worker machine's memory from ~150 MB to ~3 GB (eventlet) or ~3.5 GB (prefork) and take ~50 seconds. Is this an expected artifact of the 20 Python executions, or is there some way to reduce this?

Thanks,
Aaron
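(For anyone reading this thread later: the fixed per-task cost Kamil describes can be approximated without Airflow at all. The sketch below is an illustration using only the standard library, with made-up names; it times how long it takes to launch several fresh Python interpreters that each do trivial work, which is roughly what the prefork pool pays per task before any Airflow imports even begin.)

```python
# Rough sketch of the per-task interpreter-startup cost: each Celery
# prefork task runs in a brand-new process where Python (and, in
# Airflow's case, the whole airflow package) is loaded from scratch.
# Here we only launch bare interpreters, so the real overhead with
# Airflow imports is substantially larger.
import subprocess
import sys
import time


def time_fresh_interpreters(n: int) -> float:
    """Spawn n fresh Python interpreters concurrently, each running a
    trivial command, and return the total wall-clock time in seconds."""
    start = time.perf_counter()
    procs = [
        subprocess.Popen(
            [sys.executable, "-c", "print('test')"],
            stdout=subprocess.DEVNULL,
        )
        for _ in range(n)
    ]
    for p in procs:
        p.wait()  # block until every interpreter has exited
    return time.perf_counter() - start


if __name__ == "__main__":
    elapsed = time_fresh_interpreters(20)
    print(f"20 fresh interpreters took {elapsed:.2f}s")
```

Comparing this number against the ~50 seconds reported above gives a sense of how much of the overhead is interpreter startup versus Airflow's own import and scheduling costs.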
