GitHub user Saratqpy created a discussion: Proposal: Support "cron AND dataset" condition for DAG scheduling
Hi everyone,

I’m working on a use case where I need to run a DAG **only when two conditions are both met**:

1. A specific **cron schedule** is reached (e.g., every 5 minutes)
2. A specific **dataset has been updated** since the last DAG run

This is essentially the reverse of the current behavior with `DatasetAll` or `DatasetAny`, where datasets trigger a DAG immediately when ready. What I’m looking for is a way to combine **time-based scheduling and dataset triggering using an AND condition**, so that the DAG only runs when:

- The scheduled time (cron tick) has arrived **AND**
- The required dataset is updated and ready

The motivation behind this is to avoid premature DAG runs. For example, if the dataset is delayed or not produced yet, the DAG should skip the current time slot and wait for the next one where both conditions are satisfied.

I've been experimenting with a custom `Timetable` to implement this, but I wanted to first ask:

- Has this kind of logic been discussed before?
- Is there any plan to support this pattern natively in Airflow’s scheduling system?

Would love to hear your thoughts or suggestions for a cleaner way to achieve this.

Thanks!

GitHub link: https://github.com/apache/airflow/discussions/54095
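To make the intended "cron AND dataset" semantics concrete, here is a minimal plain-Python sketch of the decision rule. It does not use any Airflow API: `should_run` and `simulate` are hypothetical names chosen for illustration, and the tick and update timestamps are made-up sample data. A real implementation would have to get the last run time and dataset events from Airflow itself.

```python
from datetime import datetime, timedelta
from typing import Iterable, List, Optional


def should_run(
    tick: datetime,
    last_run: Optional[datetime],
    last_update: Optional[datetime],
) -> bool:
    """Hypothetical AND rule: run at a cron tick only if the dataset
    was updated since the previous run (and no later than this tick)."""
    if last_update is None or last_update > tick:
        return False  # dataset not produced yet: skip this slot
    return last_run is None or last_update > last_run


def simulate(ticks: Iterable[datetime], updates: Iterable[datetime]) -> List[datetime]:
    """Walk the cron ticks in order and return the ticks that would run."""
    runs: List[datetime] = []
    last_run: Optional[datetime] = None
    sorted_updates = sorted(updates)
    for tick in ticks:
        # Most recent dataset update visible at this tick, if any.
        seen = [u for u in sorted_updates if u <= tick]
        last_update = seen[-1] if seen else None
        if should_run(tick, last_run, last_update):
            runs.append(tick)
            last_run = tick
    return runs


# Sample data (assumed): a tick every 5 minutes, dataset updates at
# 00:07, 00:08 and 00:21.
START = datetime(2025, 1, 1, 0, 0)
ticks = [START + timedelta(minutes=5 * i) for i in range(6)]
updates = [START + timedelta(minutes=m) for m in (7, 8, 21)]
runs = simulate(ticks, updates)
print([t.strftime("%H:%M") for t in runs])  # ['00:10', '00:25']
```

With this rule, the 00:00 and 00:05 slots are skipped because no update exists yet, the two updates at 00:07 and 00:08 collapse into a single run at 00:10, and the 00:15 and 00:20 slots are skipped because nothing changed since that run.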
