Kevin Yuen created AIRFLOW-513:
Summary: ExternalTaskSensor tasks should not count towards
Project: Apache Airflow
Issue Type: Improvement
Environment: Ubuntu 14.04
Reporter: Kevin Yuen
We are using Airflow version 1.7.0 and rely heavily on `ExternalTaskSensor`
to manage dependencies between our DAGs.
We recently hit a case where the external task sensors caused the DAGs to go
into a limbo state because the sensors took up all the execution slots
defined via `AIRFLOW__CORE__PARALLELISM`.
Suppose we have 2 DAGs:
the first with 16 Python operator tasks, and the other with 16 sensors. We set
`PARALLELISM` to 16.
If the scheduler chooses to schedule all 16 sensors first, the DAG runs will
There are a couple of workarounds for this:
#. staggering the DAGs so that the first DAG, the one with the Python operators, runs first
#. lowering the sensor timeout thresholds and relying on retries
Neither of these options seems ideal to us, and we wonder whether `ExternalTaskSensor`
should really be counting towards the `PARALLELISM` limit?
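
To make the slot starvation concrete, here is a minimal toy simulation of the two-DAG scenario above. It is only a sketch under stated assumptions, not Airflow's scheduler: the `simulate` function, the tuple layout, and the task names are hypothetical; tasks are admitted in queue order whenever a slot is free; a work task finishes one tick after it starts; and each sensor is assumed to wait on one task from the other DAG and to hold its slot until that task has finished.

```python
from collections import deque


def simulate(queue_order, parallelism):
    """Return the set of task names that finish under a fixed slot limit.

    Toy model only (hypothetical, not Airflow code). Each task is a tuple
    (name, kind, waits_on), where kind is "work" or "sensor".
    """
    pending = deque(queue_order)
    running = []
    finished = set()
    while True:
        # Admit queued tasks, in order, while slots are free.
        while pending and len(running) < parallelism:
            running.append(pending.popleft())
        done_now = set()
        still_running = []
        for name, kind, waits_on in running:
            # Work tasks finish after one tick; a sensor finishes only once
            # the task it waits on finished in an earlier tick.
            if kind == "work" or waits_on in finished:
                done_now.add(name)
            else:
                still_running.append((name, kind, waits_on))
        if not done_now:
            # Nothing finished, so no slot frees up: either every task is
            # done or the remaining sensors hold all slots forever.
            return finished
        finished |= done_now
        running = still_running


work = [("work_%d" % i, "work", None) for i in range(16)]
sensors = [("sensor_%d" % i, "sensor", "work_%d" % i) for i in range(16)]

# Sensors scheduled first: they occupy all 16 slots and nothing ever finishes.
print(len(simulate(sensors + work, parallelism=16)))  # 0
# Work tasks scheduled first (the staggering workaround): all 32 tasks finish.
print(len(simulate(work + sensors, parallelism=16)))  # 32
```

In this toy model a single extra slot (`parallelism=17`) is already enough for the sensors-first ordering to drain, which suggests the problem is sensors holding every slot rather than the total amount of work.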