uranusjr closed issue #37810: Annotate a Dataset Event in the Source Task
URL: https://github.com/apache/airflow/issues/37810
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To
uranusjr commented on issue #37810:
URL: https://github.com/apache/airflow/issues/37810#issuecomment-2028991227
Core mechanism to set `DatasetEvent.extra` is implemented in #38481. I’ll
move to implementing `yield Metadata(...)` from a task next. This might take a
while since I can see how
jscheffl commented on issue #37810:
URL: https://github.com/apache/airflow/issues/37810#issuecomment-2016560016
> This also opens the door for sending multiple things from one single
function if we allow `yield Output(...)`. I can think of future extensions that
the return value does not go
uranusjr commented on issue #37810:
URL: https://github.com/apache/airflow/issues/37810#issuecomment-2014286445
I gave this a lot of thought. I am leaning toward implementing the `return
Metadata(...)` syntax mentioned above, but with a little flair to solve the
issue it conflicts with XCo
uranusjr commented on issue #37810:
URL: https://github.com/apache/airflow/issues/37810#issuecomment-1992043317
> if task return value (==XCom) shall be taken over as `extra` event data.
So if the marker is set, the return value goes to the dataset event’s extra,
_instead of_ (not in
jscheffl commented on issue #37810:
URL: https://github.com/apache/airflow/issues/37810#issuecomment-1984323146
> With that established, if we store extra metadata (of a dataset), it only
makes sense to allow extra metadata also when an XCom is written. But if we use
XCom for the extra,
uranusjr commented on issue #37810:
URL: https://github.com/apache/airflow/issues/37810#issuecomment-1982982809
Since XCom is just data storage, it can be used like an external S3 file,
or a database the user sets up. It is just a bit more automated and contains
some metadata. I feel it i
jscheffl commented on issue #37810:
URL: https://github.com/apache/airflow/issues/37810#issuecomment-1979621553
> I like the idea. How would this work if the task writes to more than one
dataset though?
I believe it might be an option, as an extension, to also be able to pick which XCom
as a
uranusjr commented on issue #37810:
URL: https://github.com/apache/airflow/issues/37810#issuecomment-1977590101
I like the idea. How would this work if the task writes to more than one
dataset though?
Another thing I’ve been thinking is to give XCom a dataset URI so we can
track line
jscheffl commented on issue #37810:
URL: https://github.com/apache/airflow/issues/37810#issuecomment-1977331326
Hi @uranusjr, I have been thinking about the same or a similar feature for many weeks
- especially in data-driven use cases. We also have a DAG that potentially
generates dataset events -
jedcunningham commented on issue #37810:
URL: https://github.com/apache/airflow/issues/37810#issuecomment-1973379755
LGTM, looking forward to this 👍
uranusjr opened a new issue, #37810:
URL: https://github.com/apache/airflow/issues/37810
### Description
To eventually support the construct and UI we’re aiming for in assets, we
need to attach metadata to the actual data, not the task that produces it, nor
the location it is written