I'm working on implementing AIP-76 (asset partitioning). While thinking it through and building a proof of concept, I made some design decisions that I think warrant formally amending AIP-76, giving notice to the community, and allowing for feedback.
I've documented the proposed amendment here: https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-76+amendment%3A+broader+partition+awareness

Essentially, this is about the fact that assets always "run" in the context of a dag. In fact they don't really "run"; rather, tasks run and emit asset events. So there's always a dag, a dag run, a task, etc. This remains true for the new-in-3.0 asset decorator, since it's just a wrapper that generates a dag with one task.

So to "run an asset" for a partition, or to "schedule an asset" in a partition-driven way, we are ultimately running a *dag* for a partition and "scheduling a dag" in a partition-driven way.

Additionally, we currently allow dags to be scheduled based on assets, and of course people will want this asset-driven scheduling to be partition-aware. So this is another sense in which dags must be partition-aware.

Which brings me to the proposed amendment: we will make dags explicitly partition-aware. A DagRun will optionally have a partition_key, and this partition key is how the task (which is updating an asset) will know which partition of the asset it is updating.

Moreover, we'll allow dags (even those not defined by the asset decorator) to be partition-driven rather than logical-date-driven, and allow users to do so in either of two ways: (1) schedule based on a partition scheme, or (2) schedule using a standard timetable but optionally use partitions instead of the logical date.

I welcome feedback. The amendment doc, https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-76+amendment%3A+broader+partition+awareness, is probably the best medium for this.

Thank you.

Daniel
