Hugo-pao opened a new issue, #52637:
URL: https://github.com/apache/airflow/issues/52637

   ### Description
   
   Context:
   This idea is related to the existing [Asset Scheduling 
Documentation](https://airflow.apache.org/docs/apache-airflow/stable/authoring-and-scheduling/asset-scheduling.html).
Currently, when multiple DAGs update the same dataset in Airflow, a consumer 
DAG scheduled on that dataset is triggered every time any single producer DAG 
succeeds. However, in certain ETL pipelines, it is desirable to trigger a 
pipeline only after multiple DAGs have completed execution (e.g., ensuring the 
integrity of a database populated by multiple processes).
   
   Problem Statement:
   The current behavior does not support waiting for multiple DAGs to complete 
before triggering a consumer DAG. This limitation complicates scenarios where 
the integrity or completeness of data, updated by multiple processes, needs to 
be ensured before proceeding with further processing.
   
   Current Workaround:
The only workaround I have found involves splitting the dataset into multiple 
datasets, each serving as the outlet of an individual producer DAG. The 
consumer DAG is then scheduled on a condition that requires updates to all of 
these datasets. This approach, however, adds complexity and may not be 
intuitive for all use cases.
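   A minimal sketch of this workaround, assuming Airflow 2.9+ conditional dataset scheduling (the dataset URIs, `dag_id`s, and task ids below are illustrative; only one of the two producers is shown):

   ```python
   import datetime

   from airflow.datasets import Dataset
   from airflow.models.dag import DAG
   from airflow.operators.empty import EmptyOperator

   # One dataset per producer instead of a single shared dataset.
   part_a = Dataset("s3://warehouse/my_table/part_a")
   part_b = Dataset("s3://warehouse/my_table/part_b")

   with DAG(
       dag_id="producer_a",
       schedule=None,
       start_date=datetime.datetime(2024, 1, 1),
   ):
       # Declaring an outlet records a dataset update when the task succeeds.
       EmptyOperator(task_id="write_part_a", outlets=[part_a])

   with DAG(
       dag_id="consumer",
       # "&" combines datasets with AND semantics: the consumer runs only
       # after BOTH part_a and part_b have been updated since its last run.
       schedule=(part_a & part_b),
       start_date=datetime.datetime(2024, 1, 1),
   ):
       EmptyOperator(task_id="process")
   ```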
   
   Feature Request:
It would be beneficial to have a feature that allows a consumer DAG to wait 
for all relevant producer DAGs to complete before triggering. This could 
potentially be implemented as a scheduling parameter, such as 
`wait_for_all`, specifying that the consumer DAG should only execute after all 
of the listed datasets have been updated by their respective producers.
   
   Suggested Solution:
   One possible enhancement could involve introducing a new scheduling 
parameter that allows users to specify conditions under which a consumer DAG 
should be triggered, such as waiting for updates from all producer DAGs or a 
specific subset of them.
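   Independently of any Airflow API, the requested semantics amount to an AND-gate over dataset updates. A self-contained sketch of that behavior (class and dataset names are purely illustrative, not a proposed implementation):

   ```python
   class AllDatasetsGate:
       """Fire only after every watched dataset has been updated since the last trigger."""

       def __init__(self, datasets):
           self.watched = set(datasets)
           self.updated = set()

       def record_update(self, dataset):
           """Record a producer update; return True when the consumer should trigger."""
           if dataset in self.watched:
               self.updated.add(dataset)
           if self.updated == self.watched:
               self.updated.clear()  # reset for the next cycle
               return True
           return False


   gate = AllDatasetsGate({"orders", "customers"})
   print(gate.record_update("orders"))     # False: still waiting on "customers"
   print(gate.record_update("customers"))  # True: all producers have reported
   ```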
   
   Additional Information:
This feature would greatly enhance the flexibility and robustness of 
pipeline orchestration in Airflow, particularly for complex workflows involving 
multiple data-integrity checks or dependencies.
   
   ### Use case/motivation
   
   _No response_
   
   ### Related issues
   
   _No response_
   
   ### Are you willing to submit a PR?
   
   - [ ] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [x] I agree to follow this project's [Code of 
Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
