Hello all,

Our Airflow automation keeps growing (Airflow 1.8, ~150 DAGs, 3 scheduler instances). Sometimes our users trigger DAG runs based on external events, so we exposed an API endpoint to run a DAG. These manually triggered DAGs should give fast feedback to the user, but we see that it takes a few minutes to schedule the first task, and often another few minutes between tasks. So most of the time is spent between tasks; the tasks themselves run for only a few seconds.

Has anybody else seen this? It looks like the scheduler often runs empty loops, with logs like:

2018-04-04 12:05:45,004:DEBUG:airflow.jobs.SchedulerJob:[CT=None] Starting Loop...
2018-04-04 12:05:45,005:INFO:airflow.jobs.SchedulerJob:[CT=None] Heartbeating the process manager
2018-04-04 12:05:45,005:INFO:airflow.jobs.SchedulerJob:[CT=None] Heartbeating the executor
2018-04-04 12:05:45,005:DEBUG:airflow.executors.celery_executor.CeleryExecutor:[CT=None] 44 running task instances
2018-04-04 12:05:45,005:DEBUG:airflow.executors.celery_executor.CeleryExecutor:[CT=None] 0 in queue
2018-04-04 12:05:45,006:DEBUG:airflow.executors.celery_executor.CeleryExecutor:[CT=None] 340 open slots
2018-04-04 12:05:45,006:DEBUG:airflow.executors.celery_executor.CeleryExecutor:[CT=None] Calling the <class 'airflow.executors.celery_executor.CeleryExecutor'> sync method
2018-04-04 12:05:45,006:DEBUG:airflow.executors.celery_executor.CeleryExecutor:[CT=None] Inquiring about 44 celery task(s)
2018-04-04 12:05:45,744:DEBUG:airflow.jobs.SchedulerJob:[CT=None] Ran scheduling loop in 0.74s
2018-04-04 12:05:45,745:DEBUG:airflow.jobs.SchedulerJob:[CT=None] Sleeping for 1.00s
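
To put numbers on how often the loops are idle, something like the following could summarize a scheduler debug log. This is only a rough sketch: it assumes the log lines look exactly like the excerpt above, and the function name `summarize_scheduler_log` is made up for illustration, not part of Airflow.

```python
import re
from statistics import mean

# Patterns taken from the log excerpt above; other Airflow versions may word these differently.
LOOP_RE = re.compile(r"Ran scheduling loop in (\d+(?:\.\d+)?)s")
QUEUE_RE = re.compile(r"(\d+) in queue")


def summarize_scheduler_log(lines):
    """Summarize loop durations and empty-queue heartbeats from scheduler log lines.

    Hypothetical helper: `lines` is any iterable of strings, e.g. an open log file.
    """
    durations = []  # seconds reported by "Ran scheduling loop in X.XXs"
    queued = []     # executor queue sizes reported by "N in queue"
    for line in lines:
        match = LOOP_RE.search(line)
        if match:
            durations.append(float(match.group(1)))
            continue
        match = QUEUE_RE.search(line)
        if match:
            queued.append(int(match.group(1)))
    return {
        "loops": len(durations),
        "mean_loop_s": mean(durations) if durations else None,
        "max_loop_s": max(durations) if durations else None,
        # Heartbeats where the executor had nothing waiting to be sent to Celery.
        "idle_heartbeats": sum(1 for q in queued if q == 0),
    }
```

It could be called as `summarize_scheduler_log(open(path))`, where `path` points at wherever the scheduler log is written. A high share of idle heartbeats while users are waiting on triggered runs would suggest the delay sits before the executor rather than in Celery or the workers.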

Maybe we need to tune our Airflow settings? We have up to 250 unacked messages on the RabbitMQ queue, which corresponds to the number of running task instances. There is a lot going on in our Airflow instance, but apart from this scheduling issue everything looks fine (CPU/memory usage, etc.).

Our general settings: 6 Docker containers with workers, parallelism = 384, dag_concurrency = 128, celeryd_concurrency = 64.

Our scheduler config section:

job_heartbeat_sec = 5
scheduler_heartbeat_sec = 5
max_threads = 2

Thanks,
mC

I am an Intel employee. All comments and opinions are my own and do not represent the views of Intel.
