[ https://issues.apache.org/jira/browse/AIRFLOW-513?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Kevin Yuen updated AIRFLOW-513:
-------------------------------
    Description:
Hi,

We are using Airflow version 1.7.0 and rely heavily on `ExternalTaskSensor` to manage dependencies between our DAGs.

We recently hit a case where the external task sensors left the DAGs in a limbo state because they took up all the execution slots defined via `AIRFLOW__CORE__PARALLELISM`.

For example, given 2 DAGs, the first with 16 Python operator tasks and the other with 16 sensors, with `PARALLELISM` set to 16: if the scheduler chooses to schedule all 16 sensors first, the DAG runs will never complete, because the sensors hold every slot while waiting on tasks that cannot start.

There are a couple of workarounds:
# staggering the DAGs so that the first DAG, with the Python operators, runs first
# lowering the TaskSensor timeout thresholds and relying on retries

Neither option seems ideal to us, and we wonder whether `ExternalTaskSensor` should really count towards the `PARALLELISM` limit?

Cheers,
Kevin

> ExternalTaskSensor tasks should not count towards parallelism limit
> -------------------------------------------------------------------
>
>                 Key: AIRFLOW-513
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-513
>             Project: Apache Airflow
>          Issue Type: Improvement
>         Environment: Ubuntu 14.04
> Version 1.7.0
>            Reporter: Kevin Yuen
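
The starvation scenario in the description can be reproduced with a small toy model. This is only a sketch under stated assumptions, not Airflow's scheduler: `all_runs_complete` is a hypothetical helper, slots are handed out strictly in queue order, a regular task finishes one tick after it gets a slot, and a sensor keeps its slot until the task with the same index has finished.

```python
from collections import deque


def all_runs_complete(queue_order, parallelism, max_ticks=1000):
    """Toy slot model (hypothetical, not Airflow code).

    queue_order is a list of ("sensor", i) or ("task", i) pairs in the
    order a scheduler would hand them out. Returns True if everything
    finishes, False if waiting sensors end up holding every slot.
    """
    pending = deque(queue_order)
    running = []
    finished_tasks = set()

    for _ in range(max_ticks):
        # Hand out free slots strictly in queue order.
        while pending and len(running) < parallelism:
            running.append(pending.popleft())

        if not running:
            return True  # nothing queued and nothing running

        # Regular tasks complete this tick and release their slots.
        finished_tasks.update(i for kind, i in running if kind == "task")

        # A sensor releases its slot only once its upstream task is done.
        still_running = [
            (kind, i)
            for kind, i in running
            if kind == "sensor" and i not in finished_tasks
        ]

        if len(still_running) == len(running) == parallelism:
            return False  # every slot is held by a waiting sensor

        running = still_running

    return False  # gave up after max_ticks


sensors = [("sensor", i) for i in range(16)]
tasks = [("task", i) for i in range(16)]

sensors_first = all_runs_complete(sensors + tasks, parallelism=16)
tasks_first = all_runs_complete(tasks + sensors, parallelism=16)
one_spare_slot = all_runs_complete(sensors + tasks, parallelism=17)
print(sensors_first, tasks_first, one_spare_slot)
```

In this model, queuing the sensors first with 16 slots stalls, while staggering the DAGs so the tasks go first, or leaving a single spare slot, lets both runs finish. Real sensors also time out and retry, which the sketch deliberately ignores.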