scr-oath opened a new issue, #33020: URL: https://github.com/apache/airflow/issues/33020
### Description

Provide a mechanism to pass data (XCom?) so that downstream DAGs have more context about how, why, when, and by what they were triggered.

### Use case/motivation

To avoid writing monolithic DAGs, it would be useful to have separate DAGs focused on discrete input and output transforms, which would also allow them to be retried or rescheduled as needed. One could imagine daily batch processing composed of several DAGs, with the dataset mechanism used as an efficient way to trigger them.

However, it seems that no information comes along with a dataset passed in each DAG's `schedule`. If several days of daily tasks are (re-)scheduled, the outlet of a dataset cannot tell downstream DAGs which "datestamp" they should process. As of now the dataset is just a string, and when a producer and consumer are loosely coupled via the Dataset, there is no way to communicate specific information about the producer's exact output.

There also doesn't appear to be a way to combine dataset-based scheduling with a time-based schedule such as `@daily`, so there is no way to connect a particular day's producer DAG run with a consumer DAG run.

If a task could query its lineage and get data or XCom information from the DAG, task, or Dataset that triggered it, then it could act efficiently on the previous task's specific output location (e.g. its datestamp directory if that's the convention, but it could be anything if a general way of passing and receiving data were provided).

### Related issues

_No response_

### Are you willing to submit a PR?

- [ ] Yes I am willing to submit a PR!

### Code of Conduct

- [X] I agree to follow this project's [Code of Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)

-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at: [email protected]
