uranusjr commented on issue #37810:
URL: https://github.com/apache/airflow/issues/37810#issuecomment-1992043317
> if task return value (==XCom) shall be taken over as `extra` event data.

So if the marker is set, the return value goes to the dataset event’s extra,
_instead of_ (not in addition to) the `xcom` table (XCom model)?

I think what makes me uncomfortable about using XCom is that the model
doesn’t attach any special semantics to the data stored in it. It is likely
that at least some people use it as generic storage for data, rather than for
metadata (about the data). This means we have no guaranteed way to tell whether
a value in there is supposed to be metadata (associated with some other data)
or arbitrary data. But if the metadata does not go into the table (and goes
somewhere else instead), I think that’s fine.

Another way to do this would be to introduce a special type to return from a
task function, like
```python
from airflow.datasets import Dataset, Metadata
from airflow.decorators import task
from airflow.io.path import ObjectStoragePath


@task(outlets=[Dataset("s3://my/data.json")])
def my_task():
    with ObjectStoragePath("s3://my/data.json").open("w") as f:
        ...  # Write to file...
    return Metadata(uri="s3://my/data.json", extra={"extra": "metadata"})
```
This is maybe more visible than setting a flag
```python
# easier to miss?
@task(outlets=[Dataset("s3://my/data.json", event_extra_source="xcom")])
def my_task():
    with ObjectStoragePath("s3://my/data.json").open("w") as f:
        ...  # Write to file...
    # Need to double check above to understand what this return implies.
    return {"extra": "metadata"}
```
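
To make the difference concrete, here is a rough standalone sketch (plain Python, not Airflow code) of how a runner could tell the two cases apart if a dedicated return type existed. Both the `Metadata` dataclass and the `route_return_value` helper below are hypothetical stand-ins invented for illustration; the destination names are just labels, not real Airflow APIs.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class Metadata:
    """Hypothetical stand-in for the proposed return type (not an Airflow class)."""

    uri: str
    extra: dict[str, Any] = field(default_factory=dict)


def route_return_value(value: Any) -> tuple[str, Any]:
    """Hypothetical helper: pick a destination for a task's return value.

    A Metadata instance is unambiguous, so its extra goes to the dataset event.
    Anything else is treated as ordinary data and goes to XCom.
    """
    if isinstance(value, Metadata):
        return ("dataset_event_extra", value.extra)
    return ("xcom", value)


# The same payload, once wrapped in the special type and once as a bare dict.
tagged = route_return_value(Metadata(uri="s3://my/data.json", extra={"extra": "metadata"}))
plain = route_return_value({"extra": "metadata"})
print(tagged)
print(plain)
```

With the special type the decision depends only on the returned value itself, so the reader does not have to look back at the decorator arguments to know what the return means.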