SCrocky opened a new issue, #56750:
URL: https://github.com/apache/airflow/issues/56750
### Apache Airflow version
3.1.0
### If "Other Airflow 2/3 version" selected, which one?
_No response_
### What happened?
### Asset scheduling behaviors
DAGs triggered by Asset Events behave in one of 3 different ways:
1. A single Asset Event triggers a single DAG Run
2. Multiple Asset Events trigger a single DAG Run
3. Asset Events that haven't triggered a DAG Run, but are older than the
last run are silently ignored
### How to make Assets behave differently
To force behaviors 2 & 3, one can set `max_active_runs=1`: every
time the DAG runs, it will "consume" (either via behavior 2 or 3) all available
Asset Events.
To force behavior 1, one must set `max_active_runs` to a high value, and
hope that Asset Events are not generated faster than the scheduler runs (or else
we fall into behavior 2).
It is important to note that the `catchup` argument does not seem to affect
this mechanic in any way.
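To make the interaction concrete, here is a toy simulation in plain Python. It is only a sketch of the mechanic described above, not Airflow's scheduler code: the function name `simulate`, its parameters, and the integer time units are all hypothetical. It assumes that on each scheduler loop, if there is a free run slot, one DAG Run is created and consumes every Asset Event queued so far. Behavior 3 is not modeled.

```python
def simulate(event_times, max_active_runs, tick_interval, run_duration, until):
    """Toy model (NOT Airflow code): return the Asset Events consumed by each DAG Run.

    All arguments are hypothetical and use arbitrary integer time units.
    """
    pending = sorted(event_times)  # events that have not happened yet
    queued = []                    # events waiting to be consumed by a run
    active_ends = []               # end times of currently active runs
    runs = []                      # one list of event times per created run

    for now in range(0, until + 1, tick_interval):  # one iteration per scheduler loop
        # Events emitted since the previous loop become visible to the scheduler.
        while pending and pending[0] <= now:
            queued.append(pending.pop(0))
        # Runs that have finished free their slot.
        active_ends = [end for end in active_ends if end > now]
        # Assumption: a new run takes ALL queued events, if a slot is free.
        if queued and len(active_ends) < max_active_runs:
            runs.append(queued)
            queued = []
            active_ends.append(now + run_duration)
    return runs


events = [1, 2, 3, 4]  # four Asset Events, one per time unit

# High parallelism, fast scheduler: one run per event (behavior 1).
print(simulate(events, max_active_runs=10, tick_interval=1, run_duration=10, until=30))
# [[1], [2], [3], [4]]

# max_active_runs=1: later events pile up behind the first run (behavior 2).
print(simulate(events, max_active_runs=1, tick_interval=1, run_duration=10, until=30))
# [[1], [2, 3, 4]]

# High parallelism, slow scheduler: events are grouped anyway (behavior 2).
print(simulate(events, max_active_runs=10, tick_interval=3, run_duration=10, until=30))
# [[1, 2, 3], [4]]
```

In this toy model the same four events produce 4, 2, or 2 runs depending only on parallelism and loop frequency, which is the coupling this issue is about.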
### The main Issue
The main issue here is:
### Asset Event Scheduling behaves in very different ways, based on DAG parallelism & Airflow Scheduler performance
These things should be unrelated, and as far as I could tell, this behavior
is undocumented.
### Linked Issues
Other issues that would likely be solved by addressing this issue:
https://github.com/apache/airflow/issues/56749 (UI changes)
https://github.com/apache/airflow/issues/53896 (distinct DAG Run per Asset
Event)
https://github.com/apache/airflow/issues/50890 (want catchup on Assets)
https://github.com/apache/airflow/issues/56691 (distinct DAG Run per Asset
Event)
https://github.com/apache/airflow/issues/56050 (Max active runs = 1 changes
behavior)
https://github.com/apache/airflow/issues/55956 (Force separate Events)
Unclear issues that may be related:
https://github.com/apache/airflow/issues/56541 ? (unclear)
https://github.com/apache/airflow/issues/42015 ? (unclear)
### What you think should happen instead?
In my professional setting we use both behavior 1 (for Event based
scheduling) and behavior 2 & 3 (for table refreshes). Check out my [Talk from
Airflow Summit
2025](https://airflowsummit.org/sessions/2025/multi-instance-asset-synchronization-push-or-pull/)
for more details.
So I suggest we make the Asset Event DAG triggering behavior configurable at
the DAG level.
For example, by adding an `asset_grouping` argument:
- if `asset_grouping=True` then we have behavior 2
- if `asset_grouping=False` then we have behavior 1
Behavior 3 is a bug in my opinion and should never happen.
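As a rough illustration of the proposed semantics, the sketch below shows how queued Asset Events could be split into DAG Runs. It is plain Python, not Airflow code: `asset_grouping` is the argument proposed in this issue and does not exist in Airflow, and `plan_runs` is a hypothetical helper name. Every queued event ends up in exactly one run, so nothing is dropped as in behavior 3.

```python
def plan_runs(queued_events, asset_grouping):
    """Return the list of event batches, one batch per DAG Run to create.

    Hypothetical helper: `asset_grouping` is the flag proposed in this issue,
    not an existing Airflow argument.
    """
    events = list(queued_events)
    if not events:
        return []  # nothing queued, nothing to trigger
    if asset_grouping:
        # Behavior 2: a single DAG Run consumes every queued Asset Event.
        return [events]
    # Behavior 1: each Asset Event gets its own DAG Run.
    return [[event] for event in events]


queued = ["event_a", "event_b", "event_c"]
print(plan_runs(queued, asset_grouping=True))   # [['event_a', 'event_b', 'event_c']]
print(plan_runs(queued, asset_grouping=False))  # [['event_a'], ['event_b'], ['event_c']]
```

The point of the flag is that the grouping is chosen explicitly per DAG, instead of emerging from `max_active_runs` and scheduler timing.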
I've put more info on the Asset Event attribution in [this
issue](https://github.com/apache/airflow/issues/56749)
I also suggest we rename `catchup` to `time_interval_catchup` or a
similar name, so that it is clear it does not apply to Asset Event based
scheduling.
And we should document all this stuff.
### How to reproduce
To reproduce, simply upload the following DAGs to a brand new Airflow
instance:
[check_dataset_sync.py](https://github.com/user-attachments/files/22961935/check_dataset_sync.py)
Make sure to use a DB other than SQLite so you can compare the difference
between `max_active_runs=1` and `max_active_runs=10`.
Then use the `airflow standalone` command.
Turn all the DAGs on.
You should obtain the following DAGs:

Then manually trigger the asset generator DAG once.

You will then see that the non-parallel DAGs only trigger twice, and the
parallel DAG triggers 4-5 times, depending on scheduler frequency.
You can check the logs to see how many Asset Events each DAG is consuming:

You can also do similar tests for Event Driven Asset Events:
[event_scheduling_test.py](https://github.com/user-attachments/files/22961969/event_scheduling_test.py)
But be sure to add your dags repo to the `PYTHONPATH`:
`export PYTHONPATH=$AIRFLOW_HOME/dags`
### Operating System
Ubuntu 24
### Versions of Apache Airflow Providers
```
apache-airflow-providers-common-compat 1.7.3
apache-airflow-providers-common-io 1.6.2
apache-airflow-providers-common-sql 1.27.5
apache-airflow-providers-postgres 6.2.3
apache-airflow-providers-smtp 2.2.0
apache-airflow-providers-standard 1.6.0
```
### Deployment
Virtualenv installation
### Deployment details
Using postgres for the Airflow DB
### Anything else?
@cmarteepants I've finally gotten around to making this issue as previously
discussed.
Let me know if everything is clear and understandable.
@uranusjr enjoy ;)
### Are you willing to submit PR?
- [ ] Yes I am willing to submit a PR!
### Code of Conduct
- [x] I agree to follow this project's [Code of
Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)