[ 
https://issues.apache.org/jira/browse/AIRFLOW-2128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16430399#comment-16430399
 ] 

Máté Szabó commented on AIRFLOW-2128:
-------------------------------------


{code:java}
min_file_process_interval = 0
{code}

I believe this is the default setting.

> 'Tall' DAGs scale worse than 'wide' DAGs
> ----------------------------------------
>
>                 Key: AIRFLOW-2128
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-2128
>             Project: Apache Airflow
>          Issue Type: Bug
>          Components: DAG, DagRun, scheduler
>    Affects Versions: 1.9.0
>            Reporter: Máté Szabó
>            Priority: Major
>              Labels: performance, usability
>         Attachments: tall_dag.py, wide_dag.py
>
>
> Tall DAG = a DAG with long chains of dependencies, e.g.: 0 -> 1 -> 2 -> ... 
> -> 998 -> 999
> Wide DAG = a DAG with many short, parallel dependencies, e.g. 0 -> 1; 0 -> 2; 
> ... 0 -> 999
> Take a super simple case where both graphs have 1000 tasks, and all the 
> tasks are just "sleep 0.03" bash commands (see the attached files).
> With the default SequentialExecutor (without parallelism), I would expect my 
> 2 example DAGs to take (approximately) the same time to run, but apparently 
> this is not the case.
> For the wide DAG, about 80 tasks executed successfully in 10 minutes; 
> for the tall one, none did.
> This anomaly also seems to affect the web UI. Opening up the graph view or the 
> tree view for the wide DAG takes about 6 seconds on my machine, but for the 
> tall one it takes significantly longer; in fact, it currently does not load at 
> all.
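
For readers without the attachments, the two shapes described in the issue can be sketched in plain Python. This is only an illustration of the dependency structure, not the attached tall_dag.py or wide_dag.py; the helper names (tall_edges, wide_edges, longest_chain) are hypothetical and nothing here uses the Airflow API.

```python
# Illustrative sketch only: hypothetical helpers, not the attached DAG files.

def tall_edges(n):
    """Chain of dependencies: 0 -> 1 -> 2 -> ... -> n-1."""
    return [(i, i + 1) for i in range(n - 1)]


def wide_edges(n):
    """Fan-out from a single root: 0 -> 1; 0 -> 2; ... 0 -> n-1."""
    return [(0, i) for i in range(1, n)]


def longest_chain(n, edges):
    """Number of tasks on the longest dependency path.

    Every edge here goes from a lower id to a higher id, so visiting the
    edges sorted by upstream id processes tasks in topological order.
    """
    depth = [1] * n
    for upstream, downstream in sorted(edges):
        depth[downstream] = max(depth[downstream], depth[upstream] + 1)
    return max(depth)


if __name__ == "__main__":
    n = 1000
    for name, edges in (("tall", tall_edges(n)), ("wide", wide_edges(n))):
        print(name, "edges:", len(edges), "longest chain:", longest_chain(n, edges))
```

Both graphs have the same number of tasks and the same number of edges; they differ only in the length of the longest dependency chain, which is the property the report contrasts.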



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
