eladkal opened a new issue, #30974: URL: https://github.com/apache/airflow/issues/30974
### Body

**The current state:**
If a DAG is defined with 4 datasets, Airflow waits for all of them to be ready before scheduling the DAG. This works well and serves use cases where all 4 datasets are crucial and must be ready.

**The use case we don't currently handle:**
It is common for datasets not to be equally important. Sometimes the core datasets are ready but some minor ones are not (for example, when one of the datasets is used as enrichment). In that case the DAG author may want to define a "grace period": how long they are willing to keep waiting before the DAG is scheduled, regardless of whether the dataset is ready. With pipelines, sometimes "good enough" is OK. The worst that can happen is that one major pipeline (which has many downstream dependent DAGs) is stuck on some minor dataset.

**Suggested ideas:**
1. Introduce the ability to skip the dataset dependency check after a grace period has passed.
2. Add a `DatasetSensor`? It could be used as a workaround (minor dependencies can be set within the DAG instead of going through the dataset feature). I don't like this one so much, but it is an option.

(Inspired by https://apache-airflow.slack.com/archives/CCQ7EGB1P/p1682791720916639)

### Committer

- [X] I acknowledge that I am a maintainer/committer of the Apache Airflow project.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at: [email protected]
