Kevin Yuen created AIRFLOW-513:

             Summary: ExternalTaskSensor tasks should not count towards parallelism limit
                 Key: AIRFLOW-513
             Project: Apache Airflow
          Issue Type: Improvement
         Environment: Ubuntu 14.04
                      Version 1.7.0

            Reporter: Kevin Yuen


We are running Airflow 1.7.0 and rely heavily on `ExternalTaskSensor` to manage 
dependencies between our DAGs. 

We recently hit a case where the external task sensors took up all the 
execution slots, leaving the DAG runs stuck in limbo indefinitely.

For example, suppose we have two DAGs: the first with 16 `PythonOperator` tasks 
and the other with 16 sensors. We set

If the scheduler chooses to schedule all 16 sensors first, the DAG runs will 
never complete, because the sensors hold the slots needed by the very tasks 
they are waiting on. 
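
To make the starvation concrete, here is a minimal toy model of a slot-limited 
executor. It is only a sketch, not Airflow's actual scheduler: the function 
name `simulate`, the one-tick task duration, and the slot count of 16 are all 
assumptions chosen for illustration.

```python
from collections import deque


def simulate(queue_order, slots, max_ticks=100):
    """Toy slot-limited executor (hypothetical, not Airflow code).

    queue_order is a list of ("task", name) or ("sensor", name) pairs in the
    order the scheduler picks them up. A task finishes one tick after it
    starts. A sensor keeps its slot until the task it names has finished.
    Returns True if everything completes within max_ticks.
    """
    pending = deque(queue_order)
    running = []
    done = set()
    for _ in range(max_ticks):
        # Fill free slots strictly in queue order.
        while pending and len(running) < slots:
            running.append(pending.popleft())
        still_running = []
        finished_now = set()
        for kind, name in running:
            if kind == "task":
                finished_now.add(name)  # tasks take exactly one tick
            elif name not in done:
                still_running.append((kind, name))  # sensor keeps waiting
            # otherwise the sensor sees its target done and frees the slot
        done |= finished_now
        running = still_running
        if not pending and not running:
            return True
    return False


tasks = [("task", f"t{i}") for i in range(16)]
sensors = [("sensor", f"t{i}") for i in range(16)]

# Assumed limit of 16 slots: sensors scheduled first never release them.
sensors_first = simulate(sensors + tasks, slots=16)
tasks_first = simulate(tasks + sensors, slots=16)
print(sensors_first, tasks_first)
```

In this model the run only completes when the tasks happen to be picked up 
before the sensors.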

There are a couple of workarounds:
#. staggering the DAGs so that the first DAG, with the Python operators, runs first
#. lowering the `ExternalTaskSensor` timeout thresholds and relying on retries
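
The second workaround can be sketched with a similar toy model. Again this is 
a hypothetical simulation rather than Airflow code: the function name 
`simulate_with_timeouts`, the tick-based timeout, and the choice to requeue a 
timed-out sensor at the back of the queue are assumptions for illustration.

```python
from collections import deque


def simulate_with_timeouts(queue_order, slots, timeout, retries, max_ticks=1000):
    """Toy executor where sensors time out and may be retried (hypothetical).

    A sensor that has held a slot for `timeout` ticks gives it up. If it has
    retries left it goes to the back of the queue, otherwise it fails.
    Returns True only if everything finishes and no sensor failed.
    """
    pending = deque((kind, name, retries) for kind, name in queue_order)
    running = []  # entries: (kind, name, retries_left, ticks_waited)
    done = set()
    failed = []
    for _ in range(max_ticks):
        # Fill free slots strictly in queue order.
        while pending and len(running) < slots:
            kind, name, retries_left = pending.popleft()
            running.append((kind, name, retries_left, 0))
        still_running = []
        finished_now = set()
        for kind, name, retries_left, waited in running:
            if kind == "task":
                finished_now.add(name)  # tasks take exactly one tick
            elif name in done:
                continue  # sensor succeeds and frees its slot
            elif waited + 1 >= timeout:
                if retries_left > 0:
                    # Timed out: release the slot and retry later.
                    pending.append((kind, name, retries_left - 1))
                else:
                    failed.append(name)
            else:
                still_running.append((kind, name, retries_left, waited + 1))
        done |= finished_now
        running = still_running
        if not pending and not running:
            return not failed
    return False


tasks = [("task", f"t{i}") for i in range(16)]
sensors = [("sensor", f"t{i}") for i in range(16)]

# Sensors scheduled first, assumed limit of 16 slots.
with_retry = simulate_with_timeouts(sensors + tasks, 16, timeout=3, retries=1)
without_retry = simulate_with_timeouts(sensors + tasks, 16, timeout=3, retries=0)
print(with_retry, without_retry)
```

With one retry the sensors eventually give the tasks room to run, but without 
retries every sensor fails, which is why this workaround depends on tuning 
both settings together.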

Neither option seems ideal to us, and we wonder whether `ExternalTaskSensor` 
should really count towards the `PARALLELISM` limit.

