collinmcnulty opened a new issue, #49900: URL: https://github.com/apache/airflow/issues/49900
### Description

There should be an option on the AssetWatcher to respect an idempotency key that the payload of a TriggerEvent could include when yielded, so that a new DAG run is only created when a new key shows up. This is already the case for task sensors for HA purposes, so the DAG run creation logic surely could respect that too.

This way you could make S3KeyTrigger and similar triggers work by having them put the name of the detected key (or its modified time, if the key is fully defined by the trigger) in the TriggerEvent payload. Then only one DAG run would be created each time the condition became true, rather than scheduling infinitely.

### Use case/motivation

I was looking to turn a DAG into one with event-driven scheduling, and the [notice on infinite scheduling](https://airflow.apache.org/docs/apache-airflow/stable/authoring-and-scheduling/event-scheduling.html#avoid-infinite-scheduling) gave me real pause. It seems that the trigger itself needs to consume the message, whereas I expect most data engineers would find it more natural for the trigger to detect the condition and for the DAG run to cause the condition to stop being true. For example, the trigger detects a file in a location, and the DAG does something with the file and then deletes it. Instead, what happens is that the [trigger has to delete the file](https://github.com/apache/airflow/blob/main/providers/standard/src/airflow/providers/standard/triggers/file.py#L127).

### Related issues

#49857 would also be fixed by this. Let files pile up in a location with the DAG paused, turn the DAG back on, and suddenly you get an event per file.

### Are you willing to submit a PR?

- [ ] Yes I am willing to submit a PR!

### Code of Conduct

- [x] I agree to follow this project's [Code of Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)

-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
