blag commented on issue #35297: URL: https://github.com/apache/airflow/issues/35297#issuecomment-1843720880
A point of clarification: `DatasetModel.extra` is distinct from `DatasetEvent.extra`. The two fields fill similar but separate roles. As I noted in [this comment](https://github.com/apache/airflow/pull/36075#issuecomment-1843390310), the documentation needs to be reworded to make this clearer.

I would update the title of this issue to "Dataset event extra field is not persisted", as that better describes what you want to happen.

I never intended for the `extra` field to be writable, at all, by Airflow tasks. I consider XComs to be _the_ only way to pass information between task instances in the same DAG run, and (less gracefully) between task instances in different DAG runs, and I think that is a pretty common sentiment.

The only intent I ever had (and note: I was not the author of the dataset AIP) for the `extra` fields on datasets and dataset events was to allow third-party integrations to easily store information from external systems that wasn't captured in Airflow's database schema, e.g. without forking the schema migrations. Those third-party integrations were originally called `DatasetEventManager` and were renamed `DatasetManager` before the first Airflow release that included datasets.

Now, the `DatasetEvent.extra` field should be _readable_ by tasks. If it is not, then that is a bug (and #36075 will not fix that). But if all you are looking to do is pass information between task instances in the same DAG run or between task instances in different DAG runs, I believe the Airflow mechanism for this is XComs, even with data-aware Airflow.

But I am perfectly happy to be corrected on this point or any other assertion I've made here.

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
