blag commented on issue #35297:
URL: https://github.com/apache/airflow/issues/35297#issuecomment-1843720880

   A point of clarification: `DatasetModel.extra` is distinct from 
`DatasetEvent.extra`. The two fields fill similar but separate roles. As I 
noted in [this 
comment](https://github.com/apache/airflow/pull/36075#issuecomment-1843390310), 
the documentation needs to be reworded to make this distinction clearer.
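
   To make the distinction concrete, here is a minimal sketch in plain Python. 
It does not use Airflow at all: the class names `DatasetSketch` and 
`DatasetEventSketch`, the URI, and the dictionary keys are hypothetical 
stand-ins, not Airflow's real models or schema. It assumes the dataset-level 
`extra` describes the dataset as a whole, while the event-level `extra` 
describes a single update to it.

```python
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class DatasetSketch:
    """Hypothetical stand-in for a dataset record (not Airflow's real model)."""

    uri: str
    # Assumed role: metadata about the dataset as a whole.
    extra: Dict[str, Any] = field(default_factory=dict)


@dataclass
class DatasetEventSketch:
    """Hypothetical stand-in for one update event on a dataset."""

    dataset: DatasetSketch
    # Assumed role: metadata about this particular update only.
    extra: Dict[str, Any] = field(default_factory=dict)


# Illustrative values; the URI and keys are made up for this sketch.
dataset = DatasetSketch(uri="s3://bucket/orders.csv", extra={"owner": "sales"})
event = DatasetEventSketch(dataset=dataset, extra={"row_count": 1200})

# The two `extra` dictionaries are separate objects with separate contents.
print(dataset.extra)
print(event.extra)
```

   The point of the sketch is only that writing to one `extra` leaves the 
other untouched, so documentation or code that treats them as the same field 
will be misleading.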
   
   I would update the title of this issue to "Dataset event extra field is 
not persisted", as that better describes what you want to happen.
   
   I never intended for the `extra` field to be writable, at all, by Airflow 
tasks. I consider XComs to be _the_ only way to pass information between task 
instances in the same DAG run, and (less gracefully) between task instances in 
different DAG runs, and I think that is a pretty common sentiment.
   
   The only intent that I ever had (and note: I was not the author of the 
dataset AIP) for the `extra` fields on datasets and dataset events was to allow 
third-party integrations to easily store information from external systems that 
wasn't captured in Airflow's database schema, i.e. to do so without forking the 
schema migrations. Those third-party integrations were originally called 
`DatasetEventManager`, and were renamed `DatasetManager` before the first 
Airflow release that included datasets.
   
   Now, the `DatasetEvent.extra` field should be _readable_ by tasks. If it is 
not, then that is a bug (and #36075 will not fix that). But if all you are 
looking to do is pass information between task instances in the same DAG run or 
between task instances in different DAG runs, I believe the Airflow mechanism 
for this is XComs, even with data-aware Airflow.
   
   But I am perfectly happy to be corrected on this point or any other 
assertions I've made here.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
