Raphael Lopez Kaufman created AIRFLOW-1398:
----------------------------------------------

             Summary: Add ability for ExternalTaskSensor to wait on multiple runs of a task
                 Key: AIRFLOW-1398
                 URL: https://issues.apache.org/jira/browse/AIRFLOW-1398
             Project: Apache Airflow
          Issue Type: Improvement
            Reporter: Raphael Lopez Kaufman


Currently, the execution_date_fn parameter of the ExternalTaskSensor only 
allows waiting for the completion of a single run of the task being sensed.

However, this prevents setups where DAGs that do not share the same schedule 
frequency still depend on one another. For example, say a DAG scheduled hourly 
transforms log data and is owned by the team in charge of logging. In the 
current setup, higher-level teams that want to use this transformed data cannot 
create DAGs that process it in daily batches while making sure the transformed 
log data was properly created. Note that simply waiting for the data to be 
present (e.g. using the HivePartitionSensor if the data is in Hive) might not 
be satisfactory, because the data being present does not mean it is ready to be 
used.

Adding the ability for an ExternalTaskSensor to wait until multiple runs of the 
task it is sensing have finished would allow higher-level teams to set up DAGs 
with an ExternalTaskSensor that senses the end task of the DAG transforming the 
log data and waits for the successful completion of 24 of its hourly runs.
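
As a rough illustration of the daily-over-hourly case, the sketch below computes 
the 24 hourly execution dates that one daily run would need to wait on, and 
checks whether all of them have succeeded. The helper names 
(hourly_execution_dates, all_runs_succeeded) are hypothetical and not part of 
Airflow; the sketch also assumes the hourly runs covered by a daily run start at 
the daily execution date. It only shows the date arithmetic, not how the sensor 
would query task state.

```python
from datetime import datetime, timedelta


def hourly_execution_dates(daily_execution_date, runs=24):
    """Hypothetical helper: execution dates of the hourly runs that one
    daily run depends on, assuming they start at the daily execution date."""
    return [daily_execution_date + timedelta(hours=h) for h in range(runs)]


def all_runs_succeeded(expected_dates, successful_dates):
    """Hypothetical check: True only if every expected run has succeeded."""
    return set(expected_dates) <= set(successful_dates)


if __name__ == "__main__":
    day = datetime(2017, 7, 10)
    expected = hourly_execution_dates(day)
    # Pretend the last hourly run has not finished yet.
    finished = expected[:-1]
    print(len(expected), all_runs_succeeded(expected, finished))
```

A sensor supporting multiple runs could keep poking until a check of this kind 
passes for every expected execution date.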



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
