[ 
https://issues.apache.org/jira/browse/AIRFLOW-2001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16640110#comment-16640110
 ] 

Stefan Seelmann commented on AIRFLOW-2001:
------------------------------------------

Yes, this issue requests what was implemented in AIRFLOW-2747. [~Fokko] could 
you please also close this one?

> Make sensors relinquish their execution slots
> ---------------------------------------------
>
>                 Key: AIRFLOW-2001
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-2001
>             Project: Apache Airflow
>          Issue Type: Bug
>          Components: db, scheduler
>            Reporter: Yati
>            Assignee: Yati
>            Priority: Major
>
> A sensor task instance should not take up an execution slot for the entirety 
> of its lifetime (as is currently the case). Indeed, for reasons outlined 
> below, it would be better if the scheduler preempted a sensor after each 
> poll, parking it away from the slot until the next poll.
> Some sensors wait for a condition that can be made true only by an 
> external party (e.g., materialization by external means of certain rows in a 
> table). By external, I mean external to the Airflow installation in question, 
> such that the producing entity itself does not need an execution slot in an 
> Airflow pool. If all sensors and their dependencies were of this nature, 
> there would be no issue. Unfortunately, a lot of real world DAGs have sensor 
> dependencies on results produced by another task, typically in some other 
> DAG, but scheduled by the same Airflow scheduler.
> Consider a simple example (arrow direction represents "must happen before", 
> just like in Airflow): DAG1(a >> b) and DAG2(c:sensor(DAG1.b) >> d). In other 
> words, the opening task c of the second DAG has a sensor dependency on the 
> ending task b of the first DAG. Imagine we have a single pool with 10 
> execution slots, and somehow task instances for c fill up the pool, while the 
> corresponding task instances of DAG1.b have not had a chance to execute (in 
> the real world this happens because of, say, back-fills or reprocessing by 
> clearing those sensor instances and their upstream tasks). This is a deadlock 
> situation, since no progress can be made here – the sensors have filled up 
> the pool waiting on tasks that themselves will never get a chance to run. 
> This problem has been [acknowledged 
> here|https://cwiki.apache.org/confluence/display/AIRFLOW/Common+Pitfalls].
> One way (suggested by Fokko) to solve this is to always run sensors in their 
> own pool, and to be careful with the concurrency settings of sensor tasks. This 
> is what a lot of users do now, but there are better solutions to this. Since 
> all the sensor interface allows for is a poll, we can, after each poll, 
> "park" the sensor and yield its execution slot to other tasks. In the above 
> scenario, there would be no "filling up" of the pool by sensor tasks, as 
> they would be polled, determined to be still unfulfilled, and then parked 
> away, thereby giving a chance to other tasks.
> This would likely require some changes to the DB, and of course to the scheduler.
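
The deadlock and the proposed parking behaviour described above can be illustrated with a toy simulation. This is only a sketch under stated assumptions, not Airflow's scheduler: the function `simulate`, its parameters, the one-tick task duration, and the FIFO queue are all hypothetical simplifications. Sensors for DAG2.c are queued ahead of the DAG1.b tasks they wait on, mirroring the back-fill scenario.

```python
from collections import deque


def simulate(pool_slots, num_pairs, park_after_poll, max_ticks=100):
    """Toy model of one pool shared by producers (b) and sensors (c).

    Returns the tick at which every sensor has succeeded, or None if
    no such tick is reached within max_ticks (treated as a deadlock).
    All names here are illustrative and not part of Airflow.
    """
    done_b = set()  # indices of DAG1.b instances that have finished
    done_c = set()  # indices of DAG2.c sensors that have succeeded

    # Sensors are queued first, so they grab the slots before producers.
    queue = deque(
        [("c", i) for i in range(num_pairs)]
        + [("b", i) for i in range(num_pairs)]
    )
    running = []  # task instances currently holding a slot

    for tick in range(1, max_ticks + 1):
        # Hand free slots to queued task instances in FIFO order.
        while queue and len(running) < pool_slots:
            running.append(queue.popleft())

        still_running = []
        for kind, i in running:
            if kind == "b":
                done_b.add(i)  # producer finishes and frees its slot
            elif i in done_b:
                done_c.add(i)  # poll succeeded, sensor frees its slot
            elif park_after_poll:
                queue.append((kind, i))  # poll failed: give up the slot
            else:
                still_running.append((kind, i))  # poll failed: keep slot
        running = still_running

        if len(done_c) == num_pairs:
            return tick
    return None


if __name__ == "__main__":
    print("holding slots:", simulate(10, 10, park_after_poll=False))
    print("parking after poll:", simulate(10, 10, park_after_poll=True))
```

With 10 slots and 10 sensor/producer pairs, the slot-holding variant never lets a producer start, while the parking variant lets producers run as soon as the sensors step aside.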



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)