uranusjr commented on PR #38432: URL: https://github.com/apache/airflow/pull/38432#issuecomment-2022100403
I gave this a longer thought and I think I’m not particularly fond of the approach mainly because it mixes `Dataset.extra` and `DatasetEvent.extra`. As described in #37810, the current design is very explicit on the two being separate—`Dataset.extra` describes the thing that’s pointed to by the URI, while `DatasetEvent.extra` describes the _data_ written into the thing represented by the URI. I quoted a few paragraphs from @blag in the issue. The design here directly passes through `Dataset.extra` to `DatasetEvent.extra`, which I believe would not make sense to most people. (For the record, I actually proposed copying `Dataset.extra` to `DatasetEvent.extra` initially, but @jedcunningham convinced me to not do it before I published #37810.) The `extra_from_return` argument also has the same problem fundamentally—the flag controlling `DatasetEvent` behaviour should not be set on `Dataset`. It should be on the task instead (or on DatasetEvent itself, but that’s awkward because the event does not exist conceptually when we want to emit the extra). -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
