[ 
https://issues.apache.org/jira/browse/AIRFLOW-1627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16176991#comment-16176991
 ] 

ASF subversion and git services commented on AIRFLOW-1627:
----------------------------------------------------------

Commit 601045027212b0fdd9899d1eec0dfa438ecb0450 in incubator-airflow's branch 
refs/heads/master from [~dxhuang]
[ https://git-wip-us.apache.org/repos/asf?p=incubator-airflow.git;h=6010450 ]

[AIRFLOW-1627] Only query pool in SubDAG init when necessary

When checking for pool conflicts in a SubDAG, ensure that a task in
the SubDAG is actually in the same pool as the SubDagOperator itself
to avoid querying the database unnecessarily.

Closes #2620 from dhuang/AIRFLOW-1627
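
For illustration, the guard described in the commit message can be sketched
roughly as follows. This is not the committed code: it is a standalone
approximation in plain Python that does not import Airflow. The names `Task`,
`check_pool_conflict`, and `get_pool_slots` are hypothetical, and
`get_pool_slots` stands in for the database query against the pool table.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Optional


@dataclass
class Task:
    # Hypothetical stand-in for a task; only the fields used below.
    task_id: str
    pool: Optional[str] = None


def check_pool_conflict(
    operator_pool: Optional[str],
    subdag_tasks: Iterable[Task],
    get_pool_slots: Callable[[str], Optional[int]],
) -> None:
    """Raise ValueError if the operator and one of its tasks share a 1-slot pool.

    get_pool_slots is an assumed callable that represents the (expensive)
    database lookup; it returns the slot count, or None if the pool is unknown.
    """
    if not operator_pool:
        # The operator is not in a pool, so no conflict is possible.
        return

    # Cheap in-memory check first: does any subdag task share the pool?
    conflicts = [t for t in subdag_tasks if t.pool == operator_pool]
    if not conflicts:
        # No task shares the pool, so the lookup is skipped entirely.
        return

    # Only now pay for the lookup.
    slots = get_pool_slots(operator_pool)
    if slots == 1:
        task_ids = ", ".join(t.task_id for t in conflicts)
        raise ValueError(
            "Operator and subdag task(s) {} both use pool {!r}, "
            "which has only 1 slot".format(task_ids, operator_pool)
        )
```

With this ordering, a DAG file containing many subdags whose tasks are not in
the operator's pool triggers no pool lookups at all during parsing.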


> SubDagOperator initialization should only query pools when necessary
> --------------------------------------------------------------------
>
>                 Key: AIRFLOW-1627
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-1627
>             Project: Apache Airflow
>          Issue Type: Improvement
>          Components: operators, subdag
>            Reporter: Daniel Huang
>            Assignee: Daniel Huang
>            Priority: Minor
>             Fix For: 1.9.0
>
>
> If a SubDagOperator is assigned to a pool, it queries the database for pool
> info to ensure there is no pool conflict with one of its tasks when only 1
> slot remains. However, we should check that a conflict is possible (a task
> in the subdag is in the same pool as the SubDagOperator) before actually
> querying for pools.
> I have a DAG with hundreds of subdags and I found that the pool conflict 
> check was taking up a fair chunk of time when processing the DAG file.
> Relevant code: 
> https://github.com/apache/incubator-airflow/blob/a81c153cc48e4c99a9e0a5047990b84c5d07e3cb/airflow/operators/subdag_operator.py#L60-L81



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
