[ 
https://issues.apache.org/jira/browse/AIRFLOW-6944?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bjorn Olsen updated AIRFLOW-6944:
---------------------------------
    Description: 
The current AWS DataSyncOperator attempts to start a DataSync Task without 
checking whether the Task is already running. If it is, the attempt fails 
(correctly).

It would be useful to optionally allow the operator to "catch up" instead of 
starting the Task. If the Task is in a particular status, e.g. 'QUEUED', we 
might want to wait for the currently queued execution to complete, instead of 
failing or submitting another one (and snowballing). This scenario can happen, 
for example, if the Task was previously submitted but the Airflow operator 
timed out waiting for it while DataSync was busy.

Or, maybe we want to wait for the Queued task to complete and then submit 
another Task anyway...

Giving the user some options for status management supports various use 
cases.

However, the current functionality of "Fail if the Task can't be started" 
should remain the default, to prevent unintentional problems that could arise 
if we instead always waited whenever a task is already queued. For example, 
if the previous task has different Include filters than the new task, then 
logically they aren't the same.
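
The options described above could be sketched roughly as follows. This is 
only an illustration: the policy names (OnBusy, FAIL, WAIT, WAIT_THEN_START), 
the plan_actions helper, and the choice of 'QUEUED' and 'RUNNING' as the 
"busy" statuses are assumptions, not part of the existing operator, and no 
AWS calls are made.

```python
from enum import Enum


class OnBusy(Enum):
    """Hypothetical policy names; not part of the existing operator."""

    FAIL = "fail"  # current behaviour, proposed to stay the default
    WAIT = "wait"  # catch up on the execution already in progress
    WAIT_THEN_START = "wait_then_start"  # catch up, then submit a new one


# Task statuses treated as "busy" in this sketch (an assumption).
BUSY_STATUSES = {"QUEUED", "RUNNING"}


def plan_actions(task_status, on_busy=OnBusy.FAIL):
    """Return the ordered steps the operator would take for a Task status."""
    if task_status not in BUSY_STATUSES:
        # Nothing in progress: start a new execution and wait for it.
        return ["start", "wait"]
    if on_busy is OnBusy.FAIL:
        raise RuntimeError(
            f"Task is {task_status}; refusing to start a new execution"
        )
    if on_busy is OnBusy.WAIT:
        return ["wait"]
    # WAIT_THEN_START: let the existing execution finish, then run a new one.
    return ["wait", "start", "wait"]
```

Keeping FAIL as the default argument mirrors the requirement that waiting 
must be opted into, since a queued execution may have been submitted with 
different Include filters.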

 

  was:
Current AWS DataSyncOperator attempts to start running a DataSync Task, with no 
regard / check for whether the task is already running or not.

It is useful to have capability to optionally allow the operator to "catch up" 
instead of starting the Task - if the Task is of a particular status eg 
'QUEUED' then we might want to just use the currently Queued one instead of 
submitting another one. This scenario can happen if the task was previously 
submitted but the operator timed out waiting for it, for example.

Or, maybe we want to wait for the Queued task to complete and then submit 
another Task ...

Allowing the user some options in terms of status management allows for various 
robustness when submitting new task executions.

However, the current functionality of "Fail if the Task can't be started" 
should remain default, to prevent unintentional problems which can arise if we 
instead decided to always wait if there is already a task queued. For example 
if the previous task has different Include filters than the new task, then 
logically they aren't the same.

 


> Allow AWS DataSync to "catch up" when Task is already running
> -------------------------------------------------------------
>
>                 Key: AIRFLOW-6944
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-6944
>             Project: Apache Airflow
>          Issue Type: Improvement
>          Components: aws
>    Affects Versions: 1.10.9
>            Reporter: Bjorn Olsen
>            Assignee: Bjorn Olsen
>            Priority: Minor
>
> The current AWS DataSyncOperator attempts to start a DataSync Task without 
> checking whether the Task is already running. If it is, the attempt fails 
> (correctly).
> It would be useful to optionally allow the operator to "catch up" instead 
> of starting the Task. If the Task is in a particular status, e.g. 'QUEUED', 
> we might want to wait for the currently queued execution to complete, 
> instead of failing or submitting another one (and snowballing). This 
> scenario can happen, for example, if the Task was previously submitted but 
> the Airflow operator timed out waiting for it while DataSync was busy.
> Or, maybe we want to wait for the Queued task to complete and then submit 
> another Task anyway...
> Giving the user some options for status management supports various use 
> cases.
> However, the current functionality of "Fail if the Task can't be started" 
> should remain the default, to prevent unintentional problems that could 
> arise if we instead always waited whenever a task is already queued. For 
> example, if the previous task has different Include filters than the new 
> task, then logically they aren't the same.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
