Reviewed the proposal. I'll share some thoughts below.
But one general thing I want to emphasize as we work towards implementing the AIP: this is not going to be easy. Airflow has a lot of complexity at this point and a lot of interacting interfaces: dags, tasks, assets, watchers, assets that are defined on their own, assets that are defined as part of a dag, assets updated from triggers, asset aliases... When we add partitioning into the mix, it has the potential to multiply that complexity. So for the initial release of asset partitioning, I think we have to keep it as simple as we can and avoid doing too much in the first introduction of the feature. I think we need to focus on the most basic scenario, namely assets that are partitioned by time windows, and on how to implement that and reconcile the implications for all the other interfaces affected by the change. Everything else probably makes sense to defer until we get that initial implementation of the core feature out. There will be enough to sort out with just that. And even focusing on only the simplest case, I expect we'll have to come to the list a few times over the next few months to resolve questions about how to reconcile these things and what the behavior should be.

Now, moving on to your proposal document specifically: one thing that stands out to me is that you do not really engage with the existing AIP, AIP-76 <https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-76%2BAsset%2BPartitions> (authored by TP and accepted last year). Of course, all AIPs change during implementation. But I'm just not sure how to interpret your proposal. Is this meant to be added to the existing AIP? Or are there components of the AIP you wish to replace with your proposal? I think it would be helpful for you to engage more directly with the AIP, and be direct about what your goals are, what you think needs to be changed, how your proposal fits in, etc.
It might be better to frame it as specific proposed amendments to the AIP, rather than leaving it to us to figure out the implications.

For example, you introduce a completeness concept to handle partition mapping (as it is sometimes called). But the existing AIP already discusses its approach to partition mapping. Here's an excerpt from the AIP:

> If you want a downstream to aggregate multiple partitions from the
> upstream, you can do
> @asset(schedule=hourly_data, partition=PartitionByInterval("@daily"))
> def aggregated_daily_data():
>     ...
> Every partition of this asset depends on 24 partitions of hourly_data of
> the day.

So the existing AIP says that, by default, the daily asset should be mapped to the 24 hourly partitions that align with the partition implied by the daily partition scheme. Interestingly, Dagster has a partition mapping interface, and if you don't provide it, it doesn't assume there should be any mapping. I kind of like that approach (explicit over implicit). And I like the language of partition mapping better than the "completeness" language.

You also propose that asset event producers can emit partition info along with the asset event, which seems reasonable enough. But here too, TP already provided in the AIP a mechanism for an asset to record what partition it's dealing with (in the case of "dynamic" partitions). And otherwise, shouldn't the asset already know what partition it's supposed to be dealing with?

Thanks

On Tue, Sep 2, 2025 at 8:39 AM Constance Martineau <consta...@astronomer.io> wrote:

> Hi Hussein,
>
> Thanks for creating this. @Daniel Standish <daniel.stand...@astronomer.io>,
> @Tzu-ping Chung <t...@astronomer.io> and I (well, mostly them :) ) will
> take a look. We have started defining an implementation plan, but it's
> still early so perfect timing.
>
> Constance
>
> On Sun, Aug 31, 2025 at 6:23 PM Hussein Awala <huss...@awala.fr> wrote:
>
>> Hi all,
>>
>> I’m not sure if the Astronomer team has already started work on
>> implementing *AIP-76*, but I’ve prepared a proposal for how we could
>> approach the implementation.
>>
>> The proposal covers:
>>
>> - Extending the asset/event model to support partitions
>> - A normalized schema for asset event partitions
>> - Watermark- and completeness-based scheduling (daily/weekly/monthly and
>>   optional rolling windows)
>> - Handling of re-processed partitions
>>
>> You can find the proposal document here:
>>
>> https://docs.google.com/document/d/17RMpjronpNerqHBN-KwNn0jjscrSYJzSDhAakentfd0/edit?usp=sharing
>>
>> I’d appreciate your feedback and review. My suggestion is that we start
>> implementation after the *Airflow 3.1 release.*
>>
>> Looking forward to your thoughts,
>>
>> Hussein
>>
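
To make the default mapping from the AIP excerpt quoted earlier in this message concrete, here is a rough sketch of what "every daily partition depends on 24 hourly partitions" could mean. It is plain Python, not Airflow code: the function names are hypothetical and appear in neither AIP-76 nor the proposal, and it assumes fixed 24-hour days with no DST handling.

```python
from datetime import datetime, timedelta


def hourly_partitions_for_day(day_start: datetime) -> list[datetime]:
    """Hypothetical helper: the 24 hourly partition starts inside one daily partition."""
    return [day_start + timedelta(hours=h) for h in range(24)]


def daily_partition_ready(day_start: datetime, seen_hourly: set[datetime]) -> bool:
    """Hypothetical check: True once every mapped hourly partition has been seen."""
    return all(p in seen_hourly for p in hourly_partitions_for_day(day_start))
```

With an explicit partition mapping interface, a function like the first one is what the user would supply. With the implicit default described in the AIP, the scheduler would derive it from the two partition schemes.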