GitHub user FelipeRamos-neuro edited a discussion: Expand dataset (asset) 
scheduling to tasks as well, changing the dag run to a deferred state while it 
waits for events triggered externally

At the company where I work, we ran into problems when trying to run a dag in a 
way that truly supports asynchronous execution driven by events from external 
services. We devised a workaround using deferrable operators, a table that 
stores state changes between tasks, a sensor that checks for those state 
changes, and a callback API endpoint that external services call to change a 
task's state when an event is generated. I haven't experimented much with 
data-aware scheduling, but as I understand it, it currently focuses on 
scheduling and triggering dag_runs. The idea hinges on introducing some sort of 
deferred state for dag_runs, where external services could, perhaps, use the 
Airflow REST API endpoint for queued events to signal Airflow that the dag run 
can continue execution. I'm not fully aware of how much complexity this would 
add to the scheduling process, but I think it could be a good feature to focus 
on after the release of Airflow 3.0.
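
For readers unfamiliar with the workaround described above, here is a minimal 
sketch of the state-table part of it. It is illustrative only and not the actual 
implementation: the table layout and the names `init_store`, `register_wait`, 
`handle_callback` and `poke` are hypothetical, none of them are Airflow APIs, 
and an in-process `sqlite3` database stands in for whatever store is really 
used. In a real deployment, `handle_callback` would presumably sit behind the 
callback API endpoint and `poke` would be called from the sensor or trigger.

```python
import sqlite3


def init_store(conn):
    """Create the (hypothetical) table that tracks per-task wait state."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS task_state ("
        "run_id TEXT, task_id TEXT, state TEXT, "
        "PRIMARY KEY (run_id, task_id))"
    )
    conn.commit()


def register_wait(conn, run_id, task_id):
    """Called by the task before it defers: mark it as waiting for an event."""
    conn.execute(
        "INSERT OR REPLACE INTO task_state (run_id, task_id, state) "
        "VALUES (?, ?, 'waiting')",
        (run_id, task_id),
    )
    conn.commit()


def handle_callback(conn, run_id, task_id, new_state):
    """Body of the callback endpoint: an external service reports an event.

    Returns True only if a task was actually waiting, so duplicate or
    unexpected callbacks are ignored.
    """
    cur = conn.execute(
        "UPDATE task_state SET state = ? "
        "WHERE run_id = ? AND task_id = ? AND state = 'waiting'",
        (new_state, run_id, task_id),
    )
    conn.commit()
    return cur.rowcount == 1


def poke(conn, run_id, task_id):
    """Sensor-style check: has the external event arrived for this task?"""
    row = conn.execute(
        "SELECT state FROM task_state WHERE run_id = ? AND task_id = ?",
        (run_id, task_id),
    ).fetchone()
    return row is not None and row[0] != "waiting"


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    init_store(conn)
    register_wait(conn, "run_1", "wait_for_service")
    print(poke(conn, "run_1", "wait_for_service"))  # still waiting
    handle_callback(conn, "run_1", "wait_for_service", "done")
    print(poke(conn, "run_1", "wait_for_service"))  # event received
```

The guard on `state = 'waiting'` in the update is a design choice made for this 
sketch, so that a repeated callback does not overwrite a state that has already 
been recorded.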

GitHub link: https://github.com/apache/airflow/discussions/44816

----
This is an automatically sent email for [email protected].
To unsubscribe, please send an email to: [email protected]
