GitHub user FelipeRamos-neuro edited a comment on the discussion: Expand 
dataset (asset) scheduling to tasks as well, changing the dag run to a deferred 
state while it awaits for events triggered externally

A use case would be something like this:
You want to automate the execution of multiple applications with a dag, but they 
all use different resources and perhaps even different infrastructures: one is a 
simple API call, another is an ETL pipeline that runs on external infrastructure 
provided by a client (so by all accounts we don't necessarily know when it will 
finish), and the last is a data validation step that runs on our own 
infrastructure. All run sequentially, each task using the previous task's 
outputs.

In such a case, the dag would have a sensor to monitor the ETL pipeline, but 
since sensors are traditionally pull-based, the dag_run would have to stay in a 
running state and the sensor would run indefinitely on the triggerer component. 
If, instead of a sensor, I could define an asset that receives push-based events 
from the Airflow REST API (passing, for example, dag_id, run_id and task_id as 
arguments), and set the dag_run to some sort of deferred state until that 
push-based event lets it resume execution, this could in certain scenarios 
reduce resource usage, since it frees worker slots and avoids process 
inefficiencies such as relying on a pull-based mechanism for monitoring.
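To illustrate the push side of this idea, here is a minimal sketch assuming 
Airflow's stable REST API endpoint `POST /api/v1/datasets/events` (available 
since Airflow 2.9 for externally created dataset events). The helper names, the 
example URIs, and the choice of keys in `extra` are illustrative, not part of 
any existing Airflow API:

```python
import json
from urllib import request


def build_event_payload(dataset_uri, dag_id, run_id, task_id):
    """Build the JSON body for an externally triggered dataset event.

    The `extra` keys (dag_id, run_id, task_id) are an assumption: they carry
    enough context for the deferred dag_run to identify which run may resume.
    """
    return {
        "dataset_uri": dataset_uri,
        "extra": {"dag_id": dag_id, "run_id": run_id, "task_id": task_id},
    }


def post_dataset_event(base_url, token, payload):
    """POST the event to the Airflow REST API.

    The external ETL system would call this when it finishes, replacing
    Airflow-side polling by a sensor.
    """
    req = request.Request(
        f"{base_url}/api/v1/datasets/events",
        data=json.dumps(payload).encode(),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )
    with request.urlopen(req) as resp:  # requires a live Airflow webserver
        return json.load(resp)


# Hypothetical identifiers for illustration only:
payload = build_event_payload(
    "s3://client-etl/output", "my_pipeline", "manual__2024-01-01", "wait_for_etl"
)
print(payload["extra"]["run_id"])
```

The missing piece, which is what this discussion proposes, is the scheduler-side 
behavior: parking the dag_run in a deferred-like state until such an event 
arrives, rather than keeping a sensor running.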

GitHub link: 
https://github.com/apache/airflow/discussions/44816#discussioncomment-11523580
