Hi all,

We have (finally) stabilised the proposals around assets, and would like to put them forward formally for discussion.

Data Awareness is a very broad topic, and we plan to do a lot of work in it. AIP-73 is therefore an umbrella AIP that discusses the topic at a very high level and outlines what we want to achieve.
https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-73+Expanded+Data+Awareness

AIP-74 (Introducing Data Assets) introduces the Asset class, which is mostly a rename of Dataset with a couple of additional arguments.
https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-74+Introducing+Data+Assets

AIP-75 (New Asset-Centric Syntax) proposes a new @asset syntax to simplify the interface for defining a construct that contains a scheduled body of work emitting data to an asset, with optional dependencies on other such bodies of work.
https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-75+New+Asset-Centric+Syntax

Although they have stayed in WIP status for a while, people have already made a lot of comments and suggestions on them, for which I’m super grateful! I hope for more discussion on this topic to help move the proposals toward acceptance.

AIPs 76 (Partitions) and 77 (Validations) are still in WIP status for now. I am currently working on 76 and aiming to provide an update early next week. We plan to defer 77 until after Airflow 3.0, since it is strictly incremental work and does not require any significant change to existing features in Airflow.

TP