That’s great! Thanks for your reply!

From: Kamil Breguła <[email protected]>
Sent: Thursday, December 5, 2019 4:19 PM
To: [email protected]
Subject: Re: Celery Task Startup Overhead

Hello,

This is caused by strict process isolation. Each task is started in a new 
process, where the Python interpreter is loaded completely anew.
This change can help solve some of your problems.
https://github.com/apache/airflow/pull/6627
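(To make the cost concrete: a minimal sketch, not Airflow code, comparing a 
cold start, where a brand-new Python interpreter is spawned and must import 
its modules from scratch, against the same imports in an already-running 
interpreter. For Airflow the cold path is much heavier still, since each task 
process re-imports the full `airflow` package.)

```python
import subprocess
import sys
import time

# Cold start: launch a fresh interpreter that has to load and import anew.
start = time.perf_counter()
subprocess.run(
    [sys.executable, "-c", "import json, logging, argparse"],
    check=True,
)
cold = time.perf_counter() - start

# Warm path: the same imports inside this already-running interpreter.
start = time.perf_counter()
import json, logging, argparse  # noqa: F401
warm = time.perf_counter() - start

print(f"cold start: {cold:.3f}s, warm import: {warm:.6f}s")
```

The gap between the two numbers is the per-task interpreter overhead the 
linked PR aims to reduce.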

Best regards,
Kamil

On Thu, Dec 5, 2019 at 9:41 PM Aaron Grubb <[email protected]> wrote:
Hi everyone,

I’ve been testing celery workers with both prefork and eventlet pools and I'm 
noticing massive startup overhead for simple BashOperators. For example, 20x 
instances of:

BashOperator(
    task_id='test0',
    bash_command="echo 'test'",
    dag=dag)

executed concurrently spikes my worker machine's memory from ~150mb to ~3gb 
(eventlet) or ~3.5gb (prefork) and takes ~50 seconds. Is this an expected 
artifact of the 20x python executions or is there some way to reduce this?
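(For anyone wanting to reproduce the shape of this outside Airflow: a 
minimal sketch that launches 20 fresh Python interpreters concurrently, 
roughly what the worker does for 20 trivial tasks. Bare interpreters are far 
cheaper than Airflow task processes, which each import the full `airflow` 
package, so absolute numbers will be much smaller here.)

```python
import subprocess
import sys
import time

# Launch 20 brand-new Python interpreters concurrently, each doing a
# trivial amount of work, and time how long the batch takes overall.
start = time.perf_counter()
procs = [
    subprocess.Popen([sys.executable, "-c", "print('test')"])
    for _ in range(20)
]
for p in procs:
    p.wait()
elapsed = time.perf_counter() - start

print(f"20 concurrent interpreter launches: {elapsed:.2f}s")
```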

Thanks,
Aaron
