GitHub user Leondon9 created a discussion: Support waiting for asset partition
readiness from time-scheduled DAGs
I am looking for guidance before attempting an implementation.
In time-scheduled DW/DataLake DAGs, `data_interval_start` and
`data_interval_end` are often still the primary processing contract.
Asset-triggered DAGs are useful, but switching a heavy batch consumer to
`schedule=[Asset(...)]` changes the run semantics because asset-triggered runs
intentionally do not have a logical date or data interval.
The use case I am trying to model is:
- keep the consumer DAG time-scheduled, for example `@daily`
- keep SQL/business logic based on the consumer's data interval
- wait for an upstream Airflow asset partition, for example `orders /
dt=2026-05-21`
- avoid coupling the consumer to the producer DAG's exact schedule via
`ExternalTaskSensor(execution_date_fn=...)`
Today I can solve this with provider-specific sensors, such as S3, Hive,
BigQuery, or Databricks partition sensors, or with `ExternalTaskSensor`. But I
could not find an Airflow-native task-level way to wait for an `AssetEvent` /
asset partition recorded in Airflow metadata.
A concrete shape might look like this:
```python
orders = Asset("warehouse://orders")
wait_orders = AssetPartitionSensor(
task_id="wait_orders",
asset=orders,
partition_key="{{ data_interval_start | ds }}",
deferrable=True,
)
```
The intended semantics would be:
- the consumer DAG is still created by its time schedule
- the sensor waits for an Airflow asset event matching `asset + partition_key`
- the consumer can continue using its own `data_interval_start` /
`data_interval_end`
- the producer DAG can be scheduled, manual, backfilled, or otherwise changed,
as long as it emits the expected asset partition event
This is not intended to replace asset-triggered scheduling, and I am not
proposing to derive `data_interval` for asset-triggered DAG runs. I am trying
to clarify whether there is room for a time-scheduled DAG to wait on asset
partition readiness as a task-level dependency.
This seems related to the problem discussed in #55489, where users need a
common date/partition concept across scheduled, triggered, and asset-aware
workflows. The maintainer comments there also point toward asset partitions /
`dag_run.partition_key` as the more general direction.
Questions:
1. Is this gap already intended to be covered by asset partitions in Airflow
3.2+?
2. If not, would a task-level asset partition readiness sensor fit Airflow's
direction?
3. Should such a feature live in the standard provider, core, or only be
documented as a pattern using provider-specific sensors?
4. What semantics would be acceptable: latest event exists, any event exists,
source-run-aware matching, or something else?
If this direction makes sense, I would be interested in helping with a small
scoped follow-up, likely starting with docs or a design issue before any
implementation PR.
GitHub link: https://github.com/apache/airflow/discussions/67375
----
This is an automatically sent email for [email protected].
To unsubscribe, please send an email to: [email protected]