jscheffl commented on PR #38432: URL: https://github.com/apache/airflow/pull/38432#issuecomment-2024093337
> I gave this a longer thought and I think I’m not particularly fond of the approach mainly because it mixes `Dataset.extra` and `DatasetEvent.extra`. As described in #37810, the current design is very explicit on the two being separate—`Dataset.extra` describes the thing that’s pointed to by the URI, while `DatasetEvent.extra` describes the _data_ written into the thing represented by the URI. I quoted a few paragraphs from @blag in the issue. The design here directly passes through `Dataset.extra` to `DatasetEvent.extra`, which I believe would not make sense to most people. (For the record, I actually proposed copying `Dataset.extra` to `DatasetEvent.extra` initially, but @jedcunningham convinced me to not do it before I published #37810.) > > The `extra_from_return` argument also has the same problem fundamentally—the flag controlling `DatasetEvent` behaviour should not be set on `Dataset`. It should be on the task instead (or on DatasetEvent itself, but that’s awkward because the event does not exist conceptually when we want to emit the extra). Thanks for sharing your concerns. After reading #38481 I better understand what you were meaning. I did not understand the difference between `Dataset.extra` (Which I always felt like "Dataset" being the holder for the URI as pointer towards a dataset) and the `DatasetEvent.extra` (which I did not notice before, expected this is the result of the outlet from execution not a definition made by the user). You ehancement in the docs in PR #38481 makes it now clear. Therefore I'm now in favor of PR #38481 where you propose a simple but explicit possibility to attach extra per outlet w/o mixing with return or XCom. Yes, and in-deed, the merging of multiple extras as DAG run conf is something separate - thought it is a 2-for-1 in this PR to publish the `extra` dynamically here as well as show how to easily consume it. So I will close this "test baloon" PR here and "maybe" will open a separate PR for the topic of using extra as DAG run conf... if time. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
