uranusjr commented on issue #37810: URL: https://github.com/apache/airflow/issues/37810#issuecomment-1977590101
I like the idea. How would this work if the task writes to more than one dataset, though?

Another thing I’ve been thinking about is giving XCom a dataset URI so we can track the lineage of its values (which also ties back to the idea of reading and writing XCom via Object Store). This raises a question: what should we do if we want to use XCom for the “actual” data when it is already used for extra?

Eventually, I think we should provide some sort of “output management” mechanism that generalises XCom. If XCom is a kind of dataset, its metadata is conceptually just automatically populated dataset metadata. So the return value should still be the actual data we want to write, which is also what downstream tasks depend on (with where and how the data is stored being customisable), and metadata should be provided some other way. I’m not entirely sure what the end result should look like, or how to transition smoothly toward it.
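
To make the split between the return value and metadata a bit more concrete, here is a very rough sketch in plain Python. Everything in it (`OutputContext`, `add_extra`, `run_task`, and the example URIs) is hypothetical and made up for illustration; none of it is an existing Airflow API, and the real design could look quite different. The only point is that the task keeps returning its actual data, while per-dataset metadata is recorded through a separate channel, which also leaves room for a task that writes to more than one dataset.

```python
from dataclasses import dataclass, field
from typing import Any, Callable


@dataclass
class OutputContext:
    """Hypothetical collector for per-dataset metadata (not an Airflow API)."""

    extras: dict[str, dict[str, Any]] = field(default_factory=dict)

    def add_extra(self, dataset_uri: str, **extra: Any) -> None:
        # Merge metadata under the URI of the dataset it describes.
        self.extras.setdefault(dataset_uri, {}).update(extra)


def run_task(
    task: Callable[[OutputContext], Any],
) -> tuple[Any, dict[str, dict[str, Any]]]:
    """Hypothetical runner: returns the task's data and its metadata separately."""
    ctx = OutputContext()
    value = task(ctx)
    return value, ctx.extras


def export_report(ctx: OutputContext) -> list[int]:
    rows = [1, 2, 3]
    # Metadata is attached per dataset, so several outputs are not a problem.
    ctx.add_extra("s3://bucket/report.csv", row_count=len(rows))
    ctx.add_extra("s3://bucket/summary.json", total=sum(rows))
    # The return value stays the actual data that downstream tasks consume.
    return rows


value, extras = run_task(export_report)
```

In this sketch `value` is what a downstream task would receive, and `extras` is what would be recorded against each dataset; how either one is stored is left open.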
