[jira] [Commented] (AIRFLOW-2001) Make sensors relinquish their execution slots

2018-10-05 Thread Yati (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16640156#comment-16640156
 ] 

Yati commented on AIRFLOW-2001:
---

Thanks! Nice use of non-local returns via an exception. I was wondering why an 
exception, but the note about custom implementations being able to do this 
makes sense. What I had in mind was to use the return code of the job process 
and have a code for signaling exactly what this exception signals. Thanks for 
fixing this!
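
The exception-based signalling discussed in this comment can be illustrated roughly as follows. This is a minimal sketch, not the code merged for AIRFLOW-2747: `RescheduleSignal`, `run_sensor_once` and `run_task` are hypothetical names, and the state strings are assumptions chosen for illustration. It shows why an exception is convenient here: any custom implementation can raise it from wherever it happens to be, and the runner translates it into a "give the slot back and try later" outcome.

```python
class RescheduleSignal(Exception):
    """Hypothetical exception a sensor raises to hand its slot back."""

    def __init__(self, retry_after_s):
        super().__init__(f"reschedule in {retry_after_s}s")
        self.retry_after_s = retry_after_s


def run_sensor_once(poke, poke_interval_s):
    """Poll once: return normally if satisfied, otherwise ask to be rescheduled."""
    if poke():
        return "done"
    raise RescheduleSignal(poke_interval_s)


def run_task(task_callable):
    """Hypothetical task runner that maps the outcome onto a (state, delay) pair."""
    try:
        task_callable()
        return ("success", None)
    except RescheduleSignal as sig:
        # Not a failure: the task wants to be run again later.
        return ("up_for_reschedule", sig.retry_after_s)
    except Exception:
        return ("failed", None)


print(run_task(lambda: run_sensor_once(lambda: False, 60)))
```

The alternative mentioned in the comment, a dedicated return code of the job process, would carry the same information but could only be produced at the process boundary rather than from arbitrary operator code.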

> Make sensors relinquish their execution slots
> ---------------------------------------------
>
> Key: AIRFLOW-2001
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2001
> Project: Apache Airflow
> Issue Type: Bug
> Components: db, scheduler
> Reporter: Yati
> Assignee: Yati
> Priority: Major
>
> A sensor task instance should not take up an execution slot for the entirety
> of its lifetime (as is currently the case). Indeed, for reasons outlined
> below, it would be better if sensor execution were preempted by the
> scheduler, which would park it away from the slot until the next poll.
>
> Some sensors wait for a condition that is affected only by an external
> party (e.g., materialization by external means of certain rows in a table).
> By external, I mean external to the Airflow installation in question, such
> that the producing entity itself does not need an execution slot in an
> Airflow pool. If all sensors and their dependencies were of this nature,
> there would be no issue. Unfortunately, a lot of real-world DAGs have sensor
> dependencies on results produced by another task, typically in some other
> DAG, but scheduled by the same Airflow scheduler.
>
> Consider a simple example (arrow direction represents "must happen before",
> just like in Airflow): DAG1(a >> b) and DAG2(c:sensor(DAG1.b) >> d). In other
> words, the opening task c of the second DAG has a sensor dependency on the
> ending task b of the first DAG. Imagine we have a single pool with 10
> execution slots, and somehow task instances for c fill up the pool, while the
> corresponding task instances of DAG1.b have not had a chance to execute (in
> the real world this happens because of, say, back-fills or reprocessing by
> clearing those sensor instances and their upstream tasks). This is a deadlock:
> no progress can be made, because the sensors have filled up the pool waiting
> on tasks that themselves will never get a chance to run.
> This problem has been [acknowledged
> here|https://cwiki.apache.org/confluence/display/AIRFLOW/Common+Pitfalls].
>
> One way (suggested by Fokko) to solve this is to always run sensors in their
> own pool, and to be careful with the concurrency settings of sensor tasks.
> This is what a lot of users do now, but there are better solutions. Since
> all the sensor interface allows for is a poll, we can, after each poll,
> "park" the sensor and yield its execution slot to other tasks. In the above
> scenario, there would be no "filling up" of the pool by sensor tasks, as
> they would be polled, determined to be still unfulfilled, and then parked
> away, thereby giving other tasks a chance to run.
>
> This would likely require some changes to the DB, and of course to the
> scheduler.
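
The deadlock and the proposed "parking" behaviour described in the issue can be sketched with a toy simulation. This is only an illustration under stated assumptions, not Airflow's scheduler: the function `simulate`, its parameters, and the tick-based model (one poll or one task run per slot per tick, sensors queued ahead of the tasks they wait on) are all hypothetical.

```python
from collections import deque


def simulate(pool_size, n_pairs, park_sensors, max_ticks=100):
    """Return the tick at which every b and c finished, or None if none did.

    Sensors c_i wait on tasks b_i. All sensors are queued first, mimicking
    the back-fill scenario in which sensors grab the pool before their
    upstream tasks have run.
    """
    done = set()
    queue = deque([("c", i) for i in range(n_pairs)]
                  + [("b", i) for i in range(n_pairs)])
    holding = []  # sensors that keep a slot across ticks (non-parking mode)
    for tick in range(1, max_ticks + 1):
        # Slot-holding sensors poll again; satisfied ones free their slot.
        still_holding = []
        for kind, i in holding:
            if ("b", i) in done:
                done.add(("c", i))
            else:
                still_holding.append((kind, i))
        holding = still_holding
        free = pool_size - len(holding)
        requeue = []
        while free > 0 and queue:
            kind, i = queue.popleft()
            free -= 1  # the slot is used for this tick
            if kind == "b":
                done.add(("b", i))
            elif ("b", i) in done:
                done.add(("c", i))
            elif park_sensors:
                requeue.append((kind, i))  # give the slot back, poll later
            else:
                holding.append((kind, i))  # keep the slot while waiting
        queue.extend(requeue)
        if len(done) == 2 * n_pairs:
            return tick
    return None


print("sensors hold slots:", simulate(10, 10, park_sensors=False))
print("sensors are parked:", simulate(10, 10, park_sensors=True))
```

Under these assumptions, the run in which sensors hold their slots never finishes within the tick limit, because the ten sensors occupy all ten slots and the b tasks are never scheduled. When sensors are parked after an unsuccessful poll, the b tasks get slots on the following tick and the sensors succeed on their next poll.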



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (AIRFLOW-2001) Make sensors relinquish their execution slots

2018-10-05 Thread Stefan Seelmann (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16640110#comment-16640110
 ] 

Stefan Seelmann commented on AIRFLOW-2001:
--

Yes, this issue requests what was implemented in AIRFLOW-2747. [~Fokko] could 
you please also close this one?



[jira] [Commented] (AIRFLOW-2001) Make sensors relinquish their execution slots

2018-10-05 Thread Iuliia Volkova (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16639412#comment-16639412
 ] 

Iuliia Volkova commented on AIRFLOW-2001:
-

[~seelmann], [~ysagade], could we close this issue? Is it solved?



[jira] [Commented] (AIRFLOW-2001) Make sensors relinquish their execution slots

2018-09-21 Thread Stefan Seelmann (JIRA)


[ 
https://issues.apache.org/jira/browse/AIRFLOW-2001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16624060#comment-16624060
 ] 

Stefan Seelmann commented on AIRFLOW-2001:
--

AIRFLOW-2747 has been merged to master, which should also solve this issue.
