[jira] [Commented] (AIRFLOW-4747) Airflow Scheduling and DAG Parsing

Ash Berlin-Taylor (JIRA) Fri, 07 Jun 2019 08:50:20 -0700


    [ 
https://issues.apache.org/jira/browse/AIRFLOW-4747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16858753#comment-16858753
 ]


Ash Berlin-Taylor commented on AIRFLOW-4747:
--------------------------------------------

AIRFLOW-2761 (PR: https://github.com/apache/airflow/pull/4234/files), which 
landed in 1.10.3, might help a bit, depending on exactly which part is slow. 
(Check out the graphs in the PR.)

> Airflow Scheduling and DAG Parsing
> ----------------------------------
>
>                 Key: AIRFLOW-4747
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-4747
>             Project: Apache Airflow
>          Issue Type: Wish
>          Components: scheduler
>    Affects Versions: 1.10.2
>            Reporter: Michael Smith
>            Priority: Major
>
> I read somewhere that there was going to be an attempt to decouple Airflow's 
> DAG parsing from its scheduler function. My assumption is that this could be 
> achieved, for example, by driving scheduler actions (almost?) entirely from 
> the Airflow database. Would this eliminate the need for a continuously 
> running DAG parse process?
> At present we observe significant lag and overhead with the current (1.10.2) 
> scheduling model, which appears to be tightly coupled to DAG parsing. In our 
> environment DAG parse times are typically >1 sec per DAG, so a single DAG 
> parse cycle can take several minutes. DAG parsing is also a large CPU 
> overhead (on a single-node cloud VM we have been forced to allocate 2 CPU 
> nodes, for example). In addition, production jobs suffer from fairly large 
> lag times between tasks (the time between the end of one task and the start 
> of its follow-on task). This can be on the order of minutes even when task 
> slots are available.
>  
> Is anyone working on this enhancement, or could anyone provide guidance on 
> resolving this? (It is possibly a configuration issue on our side, but I have 
> experimented with configuration options extensively.)
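
As a rough illustration of the arithmetic in the description above (>1 sec per 
DAG adding up to a parse cycle of several minutes), here is a minimal sketch. 
It is not Airflow code: the function name, the figure of 200 DAGs, and the 
parser counts are hypothetical, since the report gives no DAG count, and it 
assumes every DAG takes the same time to parse and that work is split evenly 
across parsers.

```python
import math


def estimate_parse_cycle_seconds(num_dags, seconds_per_dag, parallel_parsers=1):
    """Estimate the wall-clock time of one full pass over all DAGs.

    Hypothetical helper for illustration only; it assumes a uniform parse
    time per DAG and an even split of DAGs across parser processes.
    """
    if parallel_parsers < 1:
        raise ValueError("parallel_parsers must be at least 1")
    # Each parser works through its share one DAG at a time, so the cycle
    # lasts as long as the parser with the largest share.
    dags_per_parser = math.ceil(num_dags / parallel_parsers)
    return dags_per_parser * seconds_per_dag


if __name__ == "__main__":
    # Assumed figures: 200 DAGs at 1 second each.
    for parsers in (1, 2):
        seconds = estimate_parse_cycle_seconds(200, 1.0, parsers)
        print(f"{parsers} parser(s): {seconds / 60:.1f} minutes per cycle")
```

Under these assumptions, a task that becomes runnable just after its DAG was 
parsed could wait close to a full cycle before being considered again, which 
would be consistent with lag times on the order of minutes.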



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
