[ https://issues.apache.org/jira/browse/AIRFLOW-513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15565966#comment-15565966 ]
Laura Lorenz commented on AIRFLOW-513:
--------------------------------------

I think that sensors, including ExternalTaskSensors, should be counted in the core parallelism limit; they are, after all, using resources. I think the resource bottlenecks that long-waiting ExternalTaskSensors can create should be managed with pools/queues or actual DAG dependencies. On the latter point, I think in your case you may want to use the TriggerDagRunOperator instead, so that you are pushing to create a DagRun instance for DAG 2 instead of polling from DAG 2 for DAG 1. We do use some ExternalTaskSensors in the way you describe, but we increase worker limits each time we add one, and in any case we have only 4 running at a time, so we haven't reached your level of contention yet.

> ExternalTaskSensor tasks should not count towards parallelism limit
> -------------------------------------------------------------------
>
>                 Key: AIRFLOW-513
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-513
>             Project: Apache Airflow
>          Issue Type: Improvement
>         Environment: Ubuntu 14.04
>                      Version 1.7.0
>            Reporter: Kevin Yuen
>
> Hi,
> We are using airflow version 1.7.0 and we are using `ExternalTaskSensor`
> pretty heavily to manage dependencies between our DAGs.
> We have recently experienced a case where the external task sensors
> caused the DAGs to go into a limbo state because they took up all the
> execution slots defined via `AIRFLOW__CORE__PARALLELISM`.
> For example:
> Given we have 2 DAGs:
> the first one with 16 python operator tasks, and the other with 16 sensors.
> We set `PARALLELISM` to 16.
> If the scheduler chooses to schedule all 16 sensors first, the dag runs
> will never complete.
> There are a couple of workarounds for this:
> # staggering the DAGs so that the first dag with python operators runs first
> # lowering the TaskSensor timeout thresholds and relying on retries
> Both of these options seem less than ideal to us.
> We wonder if `ExternalTaskSensor` should really be counting towards the
> `PARALLELISM` limit?
> Cheers,
> Kevin



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
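
For illustration, the starvation scenario in the report (16 sensors, 16 worker tasks, `PARALLELISM` of 16) can be reproduced with a toy model. This is only a sketch and not Airflow's scheduler: the `simulate` function, the tick-based model, and the assumptions that a worker task finishes in one tick and that a sensor holds its slot until its matching task is done (no timeout) are all hypothetical simplifications.

```python
from collections import deque


def simulate(queue_order, parallelism):
    """Return True if every task finishes, False if the run stalls.

    Toy model (an assumption, not Airflow's real scheduler):
    - tasks are (kind, index) tuples, kind is "work" or "sensor"
    - a "work" task completes one tick after it gets a slot
    - "sensor" i keeps its slot until "work" i has completed
    """
    pending = deque(queue_order)
    running = []
    done = set()  # indices of completed "work" tasks

    while True:
        # Fill free slots in the order the scheduler picked the tasks.
        while pending and len(running) < parallelism:
            running.append(pending.popleft())
        if not running:
            # Nothing is running: finished only if nothing is left queued.
            return not pending

        slots_used = len(running)
        # First complete all work tasks that hold a slot in this tick.
        for kind, index in running:
            if kind == "work":
                done.add(index)
        # Sensors whose upstream work is done succeed and free their slot.
        running = [
            (kind, index)
            for kind, index in running
            if kind == "sensor" and index not in done
        ]
        if len(running) == slots_used:
            # No task finished, so no slot will ever free up: stalled.
            return False


sensors = [("sensor", i) for i in range(16)]
work = [("work", i) for i in range(16)]

stalls_when_sensors_first = not simulate(sensors + work, parallelism=16)
completes_when_work_first = simulate(work + sensors, parallelism=16)
completes_with_spare_slot = simulate(sensors + work, parallelism=17)
```

In this model, scheduling all 16 sensors first with 16 slots stalls, while running the worker DAG first, or having a single spare slot, lets both DAGs finish. That is consistent with the staggering workaround in the report and with raising worker limits as sensors are added.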