jialerchew opened a new issue, #31013: URL: https://github.com/apache/airflow/issues/31013
### Description

**The current state:** Suppose a DAG is defined with 3 Datasets. Once all 3 Datasets have been updated at least once, Airflow schedules the DAG. This works well for most use cases.

However, if one of the Dataset producers stops working and misses or fails some of its expected runs, the downstream DAG should no longer be executed, and the most recent Dataset update should be treated as invalid.

A quick way to implement this would be a concept of "freshness", where a Dataset update is only valid for a period of time. For example, if `example_dataset_2` is already 24 hours old and there is a "freshness" threshold of **12 hours**, I don't want `example_dataset_2` to count as "updated" anymore. Hence, when `example_dataset_3` updates, the DAG will still not be triggered, because `example_dataset_2` has already passed the 12-hour freshness threshold.

### Use case/motivation

My team is trying to move towards "reactive" DAGs, where we don't schedule downstream DAGs and use sensors. We want to reduce redundant DAG executions, and reactive DAGs make it easier for the team to manually retrigger failed DAG runs: just trigger the upstream DAG, and the downstream DAGs run automatically.

Datasets are the perfect fit for us; however, we are not completely comfortable switching to Datasets because they don't protect us from outdated Datasets. The regular scheduling + sensors combination does not face this issue, because that method always refers to the exact task_id determined by `execution_delta`. We don't need such precision; a way to measure the "freshness" of Datasets would be good enough.

Inspired by this [thread](https://apache-airflow.slack.com/archives/CCQ7EGB1P/p1682791720916639) on Airflow Slack.

### Related issues

Found this [other issue](https://github.com/apache/airflow/issues/30974), which is the complete opposite of this feature request.

### Are you willing to submit a PR?

- [ ] Yes I am willing to submit a PR!
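The freshness rule requested in the Description can be sketched in plain Python. This is only an illustration of the proposed behaviour, not an existing Airflow feature: `stale_datasets`, `should_trigger`, and the timestamps are hypothetical and chosen to match the 24-hour-old `example_dataset_2` and the 12-hour threshold described above.

```python
from datetime import datetime, timedelta, timezone


def stale_datasets(last_updated, now, threshold):
    """Return the names of datasets whose last update is older than `threshold`.

    `last_updated` maps a dataset name to its last update time (or None if
    it has never been updated). Hypothetical helper, not an Airflow API.
    """
    return sorted(
        name
        for name, ts in last_updated.items()
        if ts is None or now - ts > threshold
    )


# Assumed timestamps matching the scenario in the issue.
now = datetime(2023, 5, 2, 12, 0, tzinfo=timezone.utc)
last_updated = {
    "example_dataset_1": now - timedelta(hours=1),
    "example_dataset_2": now - timedelta(hours=24),  # producer missed its runs
    "example_dataset_3": now,                        # just updated
}
threshold = timedelta(hours=12)

stale = stale_datasets(last_updated, now, threshold)
# The DAG would only be triggered if no dataset is stale.
should_trigger = not stale

print(stale, should_trigger)
```

Under these assumed timestamps, only `example_dataset_2` is reported as stale, so the DAG would not be triggered even though `example_dataset_3` has just been updated.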
### Code of Conduct

- [X] I agree to follow this project's [Code of Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at: [email protected]
