ashb commented on a change in pull request #3830: [AIRFLOW-2156] Parallelize Celery Executor
URL: https://github.com/apache/incubator-airflow/pull/3830#discussion_r215554463
##########
File path: airflow/config_templates/default_airflow.cfg
##########
@@ -380,6 +380,9 @@ flower_port = 5555
 # Default queue that tasks get assigned to and that worker listen on.
 default_queue = default

+# How many processes CeleryExecutor uses to sync task state.
+sync_parallelism = 16

Review comment:
   Does multiprocessing do a fork, or an exec of a whole new Python process?

   If it's an exec, memory consumption is drastically higher, and this should default to 1 or 2. If it's a fork and we can take advantage of copy-on-write (COW) to reduce memory, then I guess this is okay.

   It's still probably worth a note in UPDATING.md about this new setting and how people should tweak it based on their scheduler node size.
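
One way to answer the fork-versus-exec question on a given scheduler node is to ask `multiprocessing` which start method it uses. With the `fork` start method the child is forked without an exec, so it begins with the parent's memory shared copy-on-write; with `spawn` a fresh Python interpreter is started. The default depends on the platform and Python version, so it is worth checking rather than assuming. The sketch below is only illustrative: the helper names `describe_start_methods`, `_report_pids` and `run_child` are hypothetical and are not part of Airflow or this pull request.

```python
import multiprocessing as mp
import os


def describe_start_methods():
    """Return the start methods available here and the one used by default."""
    return {
        "available": mp.get_all_start_methods(),
        "default": mp.get_start_method(),
    }


def _report_pids(conn):
    # Runs in the child: send back its own PID and the PID of its parent.
    conn.send((os.getpid(), os.getppid()))
    conn.close()


def run_child(method):
    """Start one child with an explicit start method.

    Returns (child_pid, parent_pid_as_seen_by_the_child).
    """
    ctx = mp.get_context(method)  # raises ValueError if the method is unavailable
    parent_conn, child_conn = ctx.Pipe()
    proc = ctx.Process(target=_report_pids, args=(child_conn,))
    proc.start()
    pids = parent_conn.recv()
    proc.join()
    return pids


if __name__ == "__main__":
    info = describe_start_methods()
    print("available start methods:", info["available"])
    print("default start method:", info["default"])
```

If the default turns out to be `fork`, a value like 16 mostly costs whatever pages the children end up writing to; CPython's reference counting touches object headers, so the sharing is usually only partial. If it is `spawn`, each process pays for a full interpreter plus imports, which is the case where a default of 1 or 2 looks safer.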
