Thanks, Vikram, for bringing this up.

A big +1 from me as well. The three patterns you mentioned are very real; I
have seen users stretch XCom in all sorts of ways to fill exactly these
gaps.
The clean separation from XCom with different scoping and lifecycle makes a
lot of sense. Will go through the AIP doc in detail.

Thanks,
Rahul


On Sun, 22 Mar 2026 at 01:39, Jens Scheffler <[email protected]> wrote:

> Thanks Vikram, Jake, XD also from my side!
>
> A big +1 for moving this forward and I think this is really important.
> Though from reading over it I do not see why it is marked as DRAFT,
> because besides nits I think it is already very mature. Everything I saw
> is in general "right". So I hope this is not a really controversial
> discussion and then we can get this in 3.3!
>
> (Some could say this concept is overdue... but it is important to have!)
>
> Jens
>
> On 21.03.26 20:58, Jarek Potiuk wrote:
> > Thanks Vikram,
> >
> > This is a crucial AIP for Airflow 3.3+. I skimmed through it and will
> > provide more comments over the coming days, but it very much looks like
> > what I imagined for state management in Airflow.
> > It has about the right abstraction layer, focusing on building
> > infrastructure that serves the previously articulated use cases and
> > likely supports other use cases we are not yet aware of. I really like how
> > it maps the "generic" interface into those cases.
> >
> > I have this old "rule of thumb": you need at least three use cases to be
> > able to design a truly reusable infrastructure API/component. Here we
> > have three use cases it will serve :)
> >
> > Jl
> >
> >
> > On Sat, Mar 21, 2026 at 8:44 PM Vikram Koka via dev <[email protected]> wrote:
> >
> >> Dear Airflowers,
> >>
> >> Over the last several months, there have been a lot of discussions in the
> >> devlist around improvements needed for long-running jobs outside of Airflow
> >> (raised by XD and others), and about improved event triggering (raised by
> >> Jake and others). XD, Jake, and I have gotten together and collaborated on
> >> a unified approach for Task State Management within Airflow which we would
> >> like to propose.
> >>
> >> Apache Airflow has been built around stateless, idempotent tasks, and this
> >> has served the community incredibly well. But as production AI and data
> >> workloads have grown more sophisticated, a clear gap has emerged that the
> >> community has been working around for a while.
> >>
> >> Three patterns keep coming up. An incremental operator needs to know where
> >> it left off last time, so it does not reprocess data it has already
> >> handled. An operator running a Databricks or EMR job needs to survive a
> >> worker disruption without cancelling a job that was 90% complete and
> >> starting over from scratch. A long-running async task processing thousands
> >> of files needs to checkpoint its progress so a retry picks up where it left
> >> off, not from the beginning.
> >>
> >> All three patterns are forcing users into the same workarounds today,
> >> generally bending XCom beyond its intended purpose, or building their own
> >> state persistence outside of Airflow entirely.
> >>
> >> We think we can do better. AIP-XX: Task State Management is a new
> >> foundation AIP that addresses all three patterns through a single, minimal,
> >> pluggable framework. Built on top of the Execution API from AIP-72, with
> >> full async support consistent with AIP-98, Task State is deliberately and
> >> cleanly separate from XCom, with different scoping, different lifecycle
> >> semantics, and different garbage collection mechanics. It also provides the
> >> foundation for a simplified AIP-93 (Asset Watermarking)
> >> <https://cwiki.apache.org/confluence/display/AIRFLOW/%5BWIP%5D+AIP-93+Asset+Watermarks+and+State+Variables>
> >> and for long-running remote operations using either the AIP-tbd Persistent
> >> Parameter for Airflow Operators
> >> <https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=399278333>
> >> or AIP-96 (Resumable Operators)
> >> <https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-96+Resumable+Operators>.
> >>
> >> The full draft is on Confluence as Draft AIP-xx: Task State Management
> >> <https://cwiki.apache.org/confluence/display/AIRFLOW/Draft%3A+AIP-xx%3A+Task+State+Management>.
> >> We would love to hear your thoughts. Please comment on the AIP doc.
> >>
> >> Best regards,
> >> Vikram, XD, and Jake
> >> --
> >>
> >> Vikram Koka
> >> Chief Strategy Officer
> >> Email: [email protected]
> >>
> >>
> >> <https://www.astronomer.io/>
> >>
>
