Hello all,

Our Airflow automation keeps growing (Airflow 1.8, ~150 DAGs, 3 scheduler 
instances). Some of our users trigger DAG runs in response to external 
events, so we exposed an API endpoint to run a DAG. These manually triggered 
DAGs should give the user fast feedback, but we see that it takes a few 
minutes to schedule the first task, and often a few more minutes between 
tasks. Most of the time is spent between tasks; the tasks themselves run for 
only a few seconds. Has anybody else seen this? It looks like the scheduler 
often runs empty loops, with logs like:
2018-04-04 12:05:45,004:DEBUG:airflow.jobs.SchedulerJob:[CT=None] Starting Loop...
2018-04-04 12:05:45,005:INFO:airflow.jobs.SchedulerJob:[CT=None] Heartbeating the process manager
2018-04-04 12:05:45,005:INFO:airflow.jobs.SchedulerJob:[CT=None] Heartbeating the executor
2018-04-04 12:05:45,005:DEBUG:airflow.executors.celery_executor.CeleryExecutor:[CT=None] 44 running task instances
2018-04-04 12:05:45,005:DEBUG:airflow.executors.celery_executor.CeleryExecutor:[CT=None] 0 in queue
2018-04-04 12:05:45,006:DEBUG:airflow.executors.celery_executor.CeleryExecutor:[CT=None] 340 open slots
2018-04-04 12:05:45,006:DEBUG:airflow.executors.celery_executor.CeleryExecutor:[CT=None] Calling the <class 'airflow.executors.celery_executor.CeleryExecutor'> sync method
2018-04-04 12:05:45,006:DEBUG:airflow.executors.celery_executor.CeleryExecutor:[CT=None] Inquiring about 44 celery task(s)
2018-04-04 12:05:45,744:DEBUG:airflow.jobs.SchedulerJob:[CT=None] Ran scheduling loop in 0.74s
2018-04-04 12:05:45,745:DEBUG:airflow.jobs.SchedulerJob:[CT=None] Sleeping for 1.00s
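
To put numbers on how much time the scheduler spends in such loops, a small 
script along these lines could be run over the scheduler log. This is only a 
sketch: it assumes the log lines are unwrapped (one entry per line) and that 
the "Ran scheduling loop in ...s" and "Sleeping for ...s" messages keep the 
wording shown in the excerpt. The names SAMPLE_LOG, loop_timings and 
summarize are made up for illustration.

```python
import re

# Three lines copied from the log excerpt, joined back onto single lines.
SAMPLE_LOG = """\
2018-04-04 12:05:45,004:DEBUG:airflow.jobs.SchedulerJob:[CT=None] Starting Loop...
2018-04-04 12:05:45,744:DEBUG:airflow.jobs.SchedulerJob:[CT=None] Ran scheduling loop in 0.74s
2018-04-04 12:05:45,745:DEBUG:airflow.jobs.SchedulerJob:[CT=None] Sleeping for 1.00s
"""

# Patterns match the message wording seen in the excerpt (an assumption
# about the log format, not something guaranteed by Airflow).
LOOP_RE = re.compile(r"Ran scheduling loop in ([0-9.]+)s")
SLEEP_RE = re.compile(r"Sleeping for ([0-9.]+)s")


def loop_timings(log_text):
    """Return (loop_durations, sleep_durations) in seconds."""
    loops = [float(value) for value in LOOP_RE.findall(log_text)]
    sleeps = [float(value) for value in SLEEP_RE.findall(log_text)]
    return loops, sleeps


def summarize(log_text):
    """Count the loops and total the time spent working versus sleeping."""
    loops, sleeps = loop_timings(log_text)
    return {
        "loops": len(loops),
        "busy_seconds": sum(loops),
        "sleep_seconds": sum(sleeps),
    }


summary = summarize(SAMPLE_LOG)
print(summary)
```

Run against a full log file (for example by passing open(path).read() to 
summarize), this would show whether the loops themselves are slow or whether 
most of the wall-clock time goes elsewhere.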

Maybe we need to tune our Airflow settings?
We have up to 250 unacked messages on the RabbitMQ queue, which corresponds 
to the number of running task instances. There is a lot going on in our 
Airflow instance, but apart from this scheduling issue everything looks fine 
(CPU/memory usage, etc.).
Our general settings:
6 Docker containers with workers; parallelism is 384, dag_concurrency 128 and 
celeryd_concurrency 64
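
As a sanity check on these numbers, the sketch below compares the configured 
capacity with the slot counts from the log excerpt. It assumes that 
celeryd_concurrency applies per worker container and that the executor's 
"open slots" figure is parallelism minus running task instances; the constant 
and function names are illustrative only.

```python
# Values quoted in this message.
WORKER_CONTAINERS = 6
CELERYD_CONCURRENCY = 64  # assumed to be per worker container
PARALLELISM = 384
RUNNING_TASKS = 44        # "44 running task instances" in the log
OPEN_SLOTS = 340          # "340 open slots" in the log


def celery_capacity(containers, concurrency_per_container):
    """Total Celery worker processes available across all containers."""
    return containers * concurrency_per_container


capacity = celery_capacity(WORKER_CONTAINERS, CELERYD_CONCURRENCY)
slots_accounted_for = RUNNING_TASKS + OPEN_SLOTS
utilization = RUNNING_TASKS / PARALLELISM

print("worker capacity:", capacity)
print("running + open slots:", slots_accounted_for)
print("share of parallelism in use: {:.1%}".format(utilization))
```

Both totals come out to 384, matching the parallelism setting, and only about 
11% of the slots are in use in that log sample, so under these assumptions 
executor capacity does not look like the limiting factor.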

Our scheduler config section:
job_heartbeat_sec = 5
scheduler_heartbeat_sec = 5
max_threads = 2


thanks
mC


I am an Intel employee. All comments and opinions are my own and do not 
represent the views of Intel.

