Reviewed the proposal.

I'll share some thoughts below.

But one general thing I want to emphasize as we work towards implementation
of the AIP: this is not going to be easy.  Airflow has a lot of complexity
at this point and a lot of interacting interfaces: dags, tasks, assets,
watchers, assets that are defined on their own, assets that are defined as
part of a dag, assets updated from triggers, asset aliases....  Adding
partitioning to the mix has the potential to multiply that complexity.  So,
for the initial release of asset partitioning, I think we have to keep it as
simple as we can and avoid doing too much in the first introduction of the
feature.  I think we need to focus on the most basic scenario, namely assets
that are partitioned by time windows: how to implement that, and how to
reconcile the implications for all the other interfaces affected by the
change.  Everything else probably makes sense to defer until we get that
initial implementation of the core feature out.  There will be enough to
sort out with just that.  And even focusing on only the simplest case, I
expect we'll have to come to the list a few times over the next few months
to resolve questions about how to reconcile these things and what the
behavior should be.

Now, moving on to your proposal document specifically, one thing that
stands out to me is that you do not really engage with the existing AIP,
AIP-76
<https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-76%2BAsset%2BPartitions>
(authored by TP and accepted last year).  Of course, all AIPs change during
implementation.  But I'm just not sure how to interpret your proposal.  Is
this meant to be added to the existing AIP?  Or are there components of the
AIP you wish to replace with your proposal?  I think it would be helpful
for you to engage more directly with the AIP, and be direct about what your
goals are, what you think needs to be changed, how your proposal fits in,
etc.  It may be better to frame it as specific proposed amendments to the
AIP, rather than leaving it to us to figure out the implications.

For example, you introduce a completeness concept to handle partition
mapping (as it is sometimes called).  But the existing AIP already
discusses its approach to partition mapping.  Here's an excerpt from the
AIP:

> If you want a downstream to aggregate multiple partitions from the
> upstream, you can do
> @asset(schedule=hourly_data, partition=PartitionByInterval("@daily"))
> def aggregated_daily_data():
> ...
> Every partition of this asset depends on 24 partitions of hourly_data of
> the day.

So, the existing AIP says that by default, each partition of the daily
asset should be mapped to the 24 hourly partitions that fall within that
day.
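
To make that default concrete, here is a minimal sketch of the day-to-hours
mapping in plain Python.  The function names are hypothetical and are not
taken from the AIP or from Airflow, and it assumes naive datetimes (no
timezone or DST handling), which a real implementation would have to sort
out.

```python
from datetime import date, datetime, timedelta


def hourly_partitions_for_day(day: date) -> list[datetime]:
    """Return the 24 hourly partition starts that fall inside one daily partition."""
    start = datetime(day.year, day.month, day.day)
    return [start + timedelta(hours=h) for h in range(24)]


def daily_partition_is_ready(day: date, received: set[datetime]) -> bool:
    """Treat a daily partition as ready once all 24 hourly partitions have arrived."""
    return all(p in received for p in hourly_partitions_for_day(day))
```

Under this reading, the daily partition for a given day would only be
eligible to run once all 24 of its hourly upstream partitions have been
produced.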

Interestingly, Dagster has a partition mapping interface, and if you don't
provide one, it doesn't assume there should be any mapping.  I kind of like
that approach (explicit over implicit).  And I like the language of
partition mapping better than the "completeness" language.
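
As a rough illustration of what "explicit over implicit" could look like,
here is a small sketch where the mapping is an optional callable and the
absence of one means no partition dependency is assumed.  This is not
Dagster's API and not anything in AIP-76; `PartitionMapping`,
`day_to_hours` and `upstream_partitions` are made-up names for the sake of
the example.

```python
from datetime import datetime, timedelta
from typing import Callable, Optional

# Hypothetical type: maps one downstream partition key to the upstream keys it needs.
PartitionMapping = Callable[[datetime], list[datetime]]


def day_to_hours(day_start: datetime) -> list[datetime]:
    """Example mapping: one daily partition depends on the 24 hours of that day."""
    return [day_start + timedelta(hours=h) for h in range(24)]


def upstream_partitions(
    downstream_key: datetime, mapping: Optional[PartitionMapping] = None
) -> list[datetime]:
    """Resolve upstream partitions; with no mapping supplied, assume none."""
    if mapping is None:
        return []
    return mapping(downstream_key)
```

With this shape, `upstream_partitions(key)` yields nothing unless the dag
author opts in by passing something like `day_to_hours`, so no dependency
is ever inferred from the partition schemes alone.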

You also propose that asset event producers can emit partition info along
with the asset event.  Which seems reasonable enough.  But, here too, TP
already provided in the AIP a mechanism for an asset to record what
partition it's dealing with (in the case of "dynamic" partitions).  And
otherwise, shouldn't the asset already know what partition it's supposed to
be dealing with?

Thanks

On Tue, Sep 2, 2025 at 8:39 AM Constance Martineau <consta...@astronomer.io>
wrote:

> Hi Hussein,
>
> Thanks for creating this. @Daniel Standish <daniel.stand...@astronomer.io>
> , @Tzu-ping Chung <t...@astronomer.io> and I (well, mostly them :) ) will
> take a look. We have started defining an implementation plan, but it's
> still early so perfect timing.
>
> Constance
>
> On Sun, Aug 31, 2025 at 6:23 PM Hussein Awala <huss...@awala.fr> wrote:
>
>> Hi all,
>>
>> I’m not sure if the Astronomer team has already started work on
>> implementing *AIP-76*, but I’ve prepared a proposal for how we could
>> approach the implementation.
>>
>> The proposal covers:
>>
>>    - Extending the asset/event model to support partitions
>>    - A normalized schema for asset event partitions
>>    - Watermark- and completeness-based scheduling (daily/weekly/monthly and
>>      optional rolling windows)
>>    - Handling of re-processed partitions
>>
>> You can find the proposal document here:
>>
>> https://docs.google.com/document/d/17RMpjronpNerqHBN-KwNn0jjscrSYJzSDhAakentfd0/edit?usp=sharing
>>
>> I’d appreciate your feedback and review. My suggestion is that we start
>> implementation after the *Airflow 3.1 release.*
>>
>>
>> Looking forward to your thoughts,
>>
>> Hussein
>>
>
