eladkal opened a new issue, #30974:
URL: https://github.com/apache/airflow/issues/30974

   ### Body
   
   **The current state:** If a DAG is defined with 4 datasets, Airflow waits 
for all of them to be ready before scheduling the DAG. This works well and 
serves use cases where all 4 datasets are crucial and must be ready.
   
   **The use case we don't currently handle:** It is common for datasets not to 
be equally important. Sometimes the core datasets are ready but some minor 
ones are not (for example, a dataset that is used only for enrichment). In that 
case the DAG author may want to define a "grace period": how long they are 
willing to keep waiting before the DAG is scheduled, regardless of whether the 
dataset is ready. With pipelines, sometimes "good enough" is OK. The worst 
that can happen is that one major pipeline (with many dependent downstream 
DAGs) is stuck on some minor dataset.
   
   **Suggested ideas:**
   1. Introduce the ability to skip the dataset dependency check after a grace 
period has passed.
   2. Add a `DatasetSensor`? It could be used as a workaround (minor dependencies 
can be set within the DAG instead of going through the dataset feature). I don't 
like this one much, but it is an option.
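
   To make idea 1 concrete, here is a rough sketch of the decision a scheduler 
could make. It is plain Python, not Airflow code: `DatasetDependency`, 
`should_schedule`, the `grace_period` field and the `waiting_since` reference 
time are all hypothetical names, and when the grace clock should start is an 
open assumption.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class DatasetDependency:
    """Hypothetical model of one dataset a DAG waits on (not an Airflow class)."""

    uri: str
    # None means no update has arrived for this dataset yet.
    updated_at: Optional[datetime] = None
    # None means the dataset is mandatory and is never skipped.
    grace_period: Optional[timedelta] = None


def should_schedule(
    deps: List[DatasetDependency], waiting_since: datetime, now: datetime
) -> bool:
    """Return True if every dataset is ready or its grace period has expired."""
    for dep in deps:
        if dep.updated_at is not None:
            continue  # dataset is ready
        if dep.grace_period is None:
            return False  # mandatory dataset is still missing
        if now - waiting_since < dep.grace_period:
            return False  # minor dataset is missing, but still inside its grace window
    return True


# Example: one core dataset that is ready, one enrichment dataset that is not.
start = datetime(2023, 5, 1, 0, 0)
deps = [
    DatasetDependency("s3://core/orders", updated_at=start),
    DatasetDependency("s3://enrichment/geo", grace_period=timedelta(hours=1)),
]
print(should_schedule(deps, start, start + timedelta(minutes=30)))
print(should_schedule(deps, start, start + timedelta(hours=2)))
```

   In this sketch a dataset without a grace period blocks the DAG indefinitely, 
which matches the current behavior; only datasets explicitly marked with a grace 
period can be skipped.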
   
   (Inspired by 
https://apache-airflow.slack.com/archives/CCQ7EGB1P/p1682791720916639 )
   
   
   ### Committer
   
   - [X] I acknowledge that I am a maintainer/committer of the Apache Airflow 
project.

