error418 opened a new issue, #35297:
URL: https://github.com/apache/airflow/issues/35297

   ### Apache Airflow version
   
   Other Airflow 2 version (please specify below)
   
   ### What happened
   
   Dataset `extra` property cannot be retrieved in dependent tasks.
   
   It seems that the information is not passed through the facades:
   
https://github.com/apache/airflow/blob/55b015f995def3bc8a3a9eef6abd7bcad49888f7/airflow/datasets/manager.py#L47-L74
   
   The information contained in the `extra` property gets lost in the following call, because `register_dataset_change` is called without its `extra` parameter:
   
https://github.com/apache/airflow/blob/55b015f995def3bc8a3a9eef6abd7bcad49888f7/airflow/models/taskinstance.py#L2337-L2346
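
A minimal, self-contained sketch of the behaviour described above. It does not use Airflow: the `Dataset` and `DatasetEvent` classes and the standalone `register_dataset_change` function are simplified, hypothetical stand-ins, written only to show how an omitted `extra` argument could lead to an empty dict on the event.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Optional


@dataclass
class Dataset:
    """Hypothetical stand-in: a dataset with a URI and an `extra` dict."""
    uri: str
    extra: Dict[str, Any] = field(default_factory=dict)


@dataclass
class DatasetEvent:
    """Hypothetical stand-in: the event a dependent task would receive."""
    dataset: Dataset
    extra: Dict[str, Any]


def register_dataset_change(
    dataset: Dataset, extra: Optional[Dict[str, Any]] = None
) -> DatasetEvent:
    # The event only carries what is passed as `extra`;
    # this toy version never falls back to dataset.extra.
    return DatasetEvent(dataset=dataset, extra=extra or {})


ds = Dataset("dataset_uri", extra={"test": "1", "another": "2"})

# Call without `extra`, as described in this report: the event's extra is empty.
event_without = register_dataset_change(ds)

# Call that forwards the dataset's extra: the values reach the event.
event_with = register_dataset_change(ds, extra=ds.extra)

print(event_without.extra)
print(event_with.extra)
```

Under these assumptions, forwarding `extra=ds.extra` at the call site would be one way to make the values visible downstream. Whether that is the right fix for Airflow itself is a separate design question.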
   
   
   
   ### What you think should happen instead
   
   The `extra` of a dataset event should not always be an empty dict when it is retrieved from the task's `triggering_dataset_events`. Instead, the contents of the `extra` dict provided on the `Dataset` should be returned.
   
   ### How to reproduce
   
   - Create a DAG with a Task providing a Dataset in the `outlets`.
   
   ```python
     from airflow.datasets import Dataset

     task.outlets.append(Dataset("dataset_uri", extra=dict(test="1", another="2")))
   ```
   - Create a data-aware DAG for this Dataset
   - Try to retrieve the information in `extra`
   
   ```python
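   import logging
   LOG = logging.getLogger(__name__)

   # `triggering_dataset_events` is taken from the task context of the data-aware DAG
   for ds_uri, ds_events in triggering_dataset_events.items():
     LOG.info("%s:", ds_uri)
     for ds_event in ds_events:
       LOG.info("  %s -- %s", ds_event.dataset, ds_event.extra)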
   ```
   
   ### Operating System
   
   Official Airflow Image
   
   ### Versions of Apache Airflow Providers
   
   _No response_
   
   ### Deployment
   
   Official Apache Airflow Helm Chart
   
   ### Deployment details
   
   Airflow 2.7.1
   deployed on k8s, using the Airflow chart
   
   ### Anything else
   
   This matter also seems to be discussed in https://github.com/apache/airflow/discussions/31542
   
   
   ### Are you willing to submit PR?
   
   - [X] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of 
Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
   

