Thank you all for your answers. I appreciate your detailed responses. I am changing my vote to +1 (binding)
Vikram

On Tue, Jan 20, 2026 at 10:22 AM Jarek Potiuk <[email protected]> wrote:

> That's a sign that the documentation for it is really needed, and likely that it should be written by someone who has not spent hours discussing it and therefore does not take a lot of mental shortcuts :). Also - I think I had very similar questions at the beginning of discussions with David, only to find out that I had to challenge my own "airflow-centric" approach, and then it started to make sense.
>
> I will leave it to David to provide his - excellent - examples from the work he was doing at his day job and the performance gains he got. But let me try to clarify the small/huge confusion.
>
> There are two performance aspects of this one, and a pattern that we should - again, indeed - document better:
>
> 1) many small async requests
> 2) that can often produce huge responses that need to be dealt with
>
> Both are actually addressed by this "syntactic sugar" - even if they might seem contradictory (and request/response is what matters here).
>
> 1) *many small async requests:* the case I referred to: when you make a lot of small async requests, the overhead of running such requests in a local async loop is very small (and this is actually used in the Triggerer - which shares multiple async calls from a number of tasks that are waiting for something - it can easily run thousands of such async requests and handle the responses efficiently). But if you want to run **many** similar requests concurrently in the "native" Airflow way - using a DeferrableOperator - in Airflow currently you have to run Mapped tasks.
> This introduces overhead: creating DagRuns with, say, thousands of TaskInstance entities; starting thousands of worker "processes" (David is working on optimising this part as well in [1] [2]); communication between those processes and the DB for the triggerer; the triggerer picking up the tasks from the DB; serialization/deserialization of the answers and sending them to the mapped workers; and finally - possibly - "reducing" the output from those multiple tasks so it can be read in some downstream task that might need those outputs to be combined. All this overhead is gone (completely) if you run all those operations in the worker - immediately, in a dedicated async loop - rather than doing the whole worker -> DB -> Triggerer -> worker dance.
>
> This is all without any "huge" payload returned. For small tasks the overhead here is very real (orders of magnitude), and David has numbers to back it up from his own experience.
>
> There are a few differences, of course, vs. Deferrable Operators:
> a) you can't track individual tasks (also you can't retry them individually)
> b) you have to handle the errors in the worker that runs them (answering your question)
> c) you do not handle persistence of the intermediate results - they are all stored in memory by default
> d) the worker is not freed while waiting - it is busy running the async loop.
>
> But you can immediately and natively use the async hooks we developed for Deferrable Operators, without worrying about starting and managing the loop yourself. (Today you could do the same without the `async task` sugar, but you would have to repeat the async loop initialization code in each such task, and the loop would not be "airflow" managed - which will come in handy in the future.)
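The single-loop pattern described in 1) can be sketched in plain `asyncio`, without any Airflow API. This is only an illustration under stated assumptions: `fetch_one` is a hypothetical stand-in for a small async hook call, and `fetch_all` and `run_batch` are made-up names. In an AIP-98 style task the body of `fetch_all` would presumably live inside the `async def` task itself; the exact decorator or operator is not shown because the thread does not spell it out.

```python
import asyncio


async def fetch_one(item_id: int) -> dict:
    """Hypothetical stand-in for one small async hook call (e.g. one HTTP request)."""
    await asyncio.sleep(0.01)  # simulates a non-blocking I/O wait
    if item_id % 25 == 0:
        # Simulated failure, to show that errors must be handled in this worker.
        raise RuntimeError(f"request {item_id} failed")
    return {"id": item_id, "status": "ok"}


async def fetch_all(item_ids: list[int], max_concurrency: int = 50) -> dict:
    """Run every request in one event loop and keep all results in memory."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def guarded(item_id: int) -> dict:
        # The semaphore caps how many requests are in flight at once.
        async with semaphore:
            return await fetch_one(item_id)

    # return_exceptions=True hands failures back as values instead of
    # raising the first one, so the caller decides what to do with them.
    outcomes = await asyncio.gather(
        *(guarded(i) for i in item_ids), return_exceptions=True
    )
    results = [o for o in outcomes if not isinstance(o, BaseException)]
    errors = [o for o in outcomes if isinstance(o, BaseException)]
    return {"results": results, "errors": errors}


def run_batch(n: int = 1000) -> dict:
    """Synchronous entry point: one worker, one event loop, n small requests."""
    return asyncio.run(fetch_all(list(range(1, n + 1))))
```

Calling `run_batch(100)` returns 96 results and 4 collected errors (ids 25, 50, 75 and 100). That mirrors points a) to d) above: nothing is tracked or retried individually, errors are handled in the worker, and results stay in memory while the worker is busy running the loop.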
>
> 2) *many async requests with large payloads*: the case that Daniel talked about - similar to the one above, but also involving a potentially "large" payload being returned. In a number of cases the data returned from the async tasks that needs further processing is huge, or just "big": either small enough to fit in memory, or too big to fit but available through a streaming async interface, so we can get it a chunk at a time. In such a case, if you use the classic "Deferrable Operator", you would need to get that data from multiple mapped tasks and store it in XCom, so that a downstream task can aggregate and process it. With "natively async tasks" from AIP-98, all that can be "compacted" into a single async task running a number of parallel tasks - possibly streaming the payload, processing it, and producing aggregated output as smaller "in-memory" data from all those async responses - and storing that as an XCom for downstream tasks. So on top of the overhead from 1) that I wrote about, the classic approach also carries the XCom overhead in 2) that David wrote about. Again - in this case you are not really using the integration with the Airflow UI, i.e. those monitoring and management features that we already have: seeing logs individually, seeing data returned individually, the ability to partially reprocess such mapped tasks. All this is gone, because essentially we compact it all into a single process running a separate async loop and doing both the "map" and the "reduce" part of what mapped tasks are designed for - without the monitoring/management, but also without the overhead it causes.
>
> This could - again - be implemented now as custom code run in "synchronous" tasks. You can create an async loop, write your own async methods, and add logic to start and wait for them in your synchronous tasks. But, contrary to AIP-98, it can't be extended in the future to provide better integration with the Airflow UI.
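The "map and reduce in one task" idea from 2), combined with the do-it-yourself loop in a synchronous task, might look roughly like the following. It is a sketch under assumptions: `stream_pages` is a hypothetical paginated source that produces synthetic numbers, and none of the function names come from Airflow or from the AIP.

```python
import asyncio
from typing import AsyncIterator


async def stream_pages(
    source_id: int, pages: int = 5, page_size: int = 1000
) -> AsyncIterator[list[int]]:
    """Hypothetical paginated API: yields one page (chunk) at a time."""
    for page in range(pages):
        await asyncio.sleep(0)  # give control back to the loop, as a network read would
        start = page * page_size
        yield [source_id + n for n in range(start, start + page_size)]


async def summarise_source(source_id: int) -> dict:
    """Consume one stream chunk by chunk, keeping only running totals in memory."""
    count = 0
    total = 0
    async for chunk in stream_pages(source_id):
        count += len(chunk)
        total += sum(chunk)
    return {"source": source_id, "count": count, "total": total}


async def map_and_reduce(source_ids: list[int]) -> dict:
    """'Map' over the sources concurrently, then 'reduce' to one small payload."""
    summaries = await asyncio.gather(*(summarise_source(s) for s in source_ids))
    return {
        "sources": len(summaries),
        "rows": sum(s["count"] for s in summaries),
        "total": sum(s["total"] for s in summaries),
    }


def sync_task(source_ids: list[int]) -> dict:
    """The do-it-yourself pattern: a synchronous task that owns its own event loop."""
    return asyncio.run(map_and_reduce(source_ids))
```

Only the small dictionary returned by `sync_task` would need to be stored for a downstream task; the individual pages never leave the process. The `asyncio.run` call in `sync_task` is the loop-management boilerplate that the proposal aims to hide.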
>
> When the async loop is "managed" by Airflow - AIP-99 and "async task" will become a "first class citizen" - we can later leverage async loop monitoring, for example TaskGroups, which allow handling "group exceptions" (introduced in Python 3.11), and the better monitoring interfaces of asyncio (introduced later). This will allow our users to track the progress of such running async calls, or to introduce some more sophisticated retry scenarios when some of those async hook calls fail. This could all be exposed via the Airflow UI, with an optimised execution API handling bulk status updates and status querying. Not the full functionality of what Mapped Tasks provide, but a useful subset of it - for those who are willing to trade some manageability aspects for performance.
>
> I hope it might help to clear it up further.
>
> J.
>
> [1] https://github.com/apache/airflow/pull/55068
> [2] https://github.com/apache/airflow/pull/53009
>
> On Mon, Jan 19, 2026 at 10:01 PM Vikram Koka <[email protected]> wrote:
>
>> David and Jarek,
>>
>> Your responses together have clarified things in one dimension and confused me more in another.
>>
>> Clarified for me:
>> - This is not a replacement for DeferrableOperators in most cases.
>> - This is intended to be complementary to DeferrableOperators, and based on the use case you would use one vs. the other.
>>
>> Confused me further:
>> - What those specific use cases are:
>> From David's response, I understand that the use case best suited for async Operators would be:
>> - For large payloads, these would overwhelm the Airflow metadatabase, since DeferrableOperators do not have access to external XCom storage systems.
>>
>> From Jarek's response, I understand that the use case best suited for async Operators would be:
>> - For a number of small, async operations to be done possibly concurrently, leveraging async I/O.
>>
>> I understand David's explanation a bit better, possibly because I can relate to it from a concrete use case perspective.
>> The follow-up questions I do have are about changes in task behavior with respect to task retries, as well as how / where intermediate task failures should be handled.
>> This also raises the question of interaction with other AIPs such as watermarks, resumable operators, and so on.
>> But, setting those aside, I am just trying to think through how this should be represented to the user, i.e. the DAG author. This strikes me as an "advanced use case", but I am still learning.
>>
>> I don't understand Jarek's explanation. Can you please clarify with a concrete use case?
>>
>> Best regards,
>> Vikram
>>
>> On Sun, Jan 18, 2026 at 4:03 AM Jarek Potiuk <[email protected]> wrote:
>>
>>> Yeah. I would absolutely see this as complementary, not even trying to replace Deferrable Operators. I think we should make it clear in the documentation, so as not to confuse people, that the use cases and behaviours are different. Really, they have one thing in common - both DeferrableOperators and Async support for Python Operators can easily leverage "async Hooks".
>>>
>>> For me, the name of DeferrableOperators explains it all (and there is a good reason we did not name it AsyncOperators). The distinction I see:
>>>
>>> * Deferrable Operators are good when you have a generally synchronous operation that should be deferred for later (usually much later)
>>> * AsyncPythonOperator is good when you want to do a number of small async operations, possibly concurrently, but you do not want to defer those; you simply leverage the capability of async I/O operations to run concurrently (note - not in parallel, but concurrently - using a single worker CPU and async I/O non-GIL operations for multiplexing many operations).
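The "concurrently, not in parallel" distinction can be observed directly with a small experiment. This is an illustrative sketch only: `small_io_call` is a hypothetical I/O-bound call simulated with `asyncio.sleep`, and the other names are made up for the example.

```python
import asyncio
import threading
import time


async def small_io_call(delay: float) -> int:
    """Hypothetical I/O-bound call; returns the id of the thread it ran on."""
    await asyncio.sleep(delay)  # while this waits, the loop runs other coroutines
    return threading.get_ident()


async def run_many(n: int, delay: float) -> tuple[set[int], float]:
    """Start n calls at once and report the threads used and the wall time."""
    started = time.perf_counter()
    thread_ids = await asyncio.gather(*(small_io_call(delay) for _ in range(n)))
    elapsed = time.perf_counter() - started
    return set(thread_ids), elapsed


def demo(n: int = 200, delay: float = 0.05) -> tuple[set[int], float]:
    """Synchronous entry point for the experiment."""
    return asyncio.run(run_many(n, delay))
```

All calls report the same thread id, so nothing runs in parallel, yet the total wall time stays far below `n * delay` because the waits overlap on the single event loop.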
>>>
>>> Those are very, very distinct use cases; I would even say they do not have anything in common (except using async hooks).
>>>
>>> Also, what AsyncPythonOperator does was essentially possible before - with some boilerplate async loop utilisation code. So really what AIP-98 does is add syntactic sugar and hide the async loop management code, to make it a) easier, b) native for Airflow with `async def task()`, c) more discoverable by our users (provided we iterate on documentation and examples).
>>>
>>> In the future (as David mentioned) it opens the way to better integration with async task monitoring - for example, so that we could see the progress of those concurrent tasks in the Airflow UI rather than by looking at logs - and to things like more reusable "standard" operators (like IterableOperator). I'd say it's a really foundational change to recognise "single worker async multiplexing" as a native Airflow feature - which will bring some nice things in the future.
>>>
>>> J.
>>>
>>> On Sat, Jan 17, 2026 at 9:29 AM Blain David <[email protected]> wrote:
>>>
>>>> Hello Vikram,
>>>>
>>>> Thank you for your reply.
>>>>
>>>> To be clear, no, I'm not deprecating deferrable operators; it just depends on what the operator does:
>>>>
>>>> 1. If the operator is deferrable because it needs to use an async hook to retrieve huge payloads from a paginated API, then yes, I would prefer the async operator over the deferred one, for example for the MSGraphAsyncOperator or the HttpOperator.
>>>> The reason is what was also explained in the devlist discussion before: you're literally overloading the triggers (in memory) and the Airflow metadatabase (triggers table) with huge payloads, something triggers are not designed for (though you could do it). Triggers don't have anything like an XCom backend that you can easily replace with another one, so you're stuck with storing the payloads (trigger events) in the Airflow database table.
>>>>
>>>> 2. If the operator is deferrable because it needs to poll to determine whether it succeeded or not, then yes, being deferrable makes sense. For example, I just started a PR (https://github.com/apache/airflow/pull/60651) to fix an issue related to polling in the WinRMOperator, which blocks the worker for no reason as it just awaits an answer. It's similar to point one, but here the payload in the triggers is small and so is the execution time.
>>>>
>>>> So yes, in some cases I would advocate using the BaseAsyncOperator, but in other cases not; it all depends on the responsibility of the operator and what you're doing with it.
>>>> AIP-98 also opens the door to implementing the IterableOperator in the future, which was also discussed, mostly with Jarek, on the devlist ( https://lists.apache.org/thread/ztnfsqolow4v1zsv4pkpnxc1fk0hbf2p ), as he knows the idea behind it, but that's also still work in progress.
>>>>
>>>> On the other hand, deferrable operators also have a huge advantage because they rely on triggers: that allows us to implement the "streaming" mechanism, or the lazy dynamic task mapping expansion, that I explained in AIP-88 ( https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=334760511 ) and presented at the Summit in Seattle last year. Once PR #55068 (https://github.com/apache/airflow/pull/55068/) is merged, I will continue working on that one as well.
>>>>
>>>> So yes, it all depends on the use case.
>>>>
>>>> I hope this makes it a bit clearer to you.
>>>>
>>>> David
>>>>
>>>> -----Original Message-----
>>>> From: Vikram Koka via dev <[email protected]>
>>>> Sent: 16 January 2026 19:58
>>>> To: [email protected]
>>>> Cc: Vikram Koka <[email protected]>
>>>> Subject: Re: [VOTE] AIP-98: Add async support for PythonOperator in Airflow 3
>>>>
>>>> Hey David,
>>>>
>>>> Just read the AIP and posted questions on the Confluence page as well.
>>>>
>>>> I find this very interesting and am *overall supportive*, but I have several questions about usage and user / developer guidance.
>>>> Specifically, about what we should be recommending for which user situations. I put the following questions on the Confluence page as well:
>>>>
>>>> - You are making a strong case for supporting async within the PythonOperator pattern over Deferrable Operators.
>>>> - What I am missing is: when should users be using Deferrable Operators instead?
>>>> - Also, are you advocating deprecating Deferrable Operators entirely? I am not opposed to it, but I am definitely curious about your viewpoint here.
>>>>
>>>> Until then, I would vote
>>>> -0.5 (binding)
>>>>
>>>> I am absolutely willing and intend to change my vote; I just want to get these questions answered first, is all.
>>>>
>>>> These are questions which any user would have, and I therefore believe it is important to address them as part of making and merging this change.
>>>>
>>>> Vikram
>>>>
>>>> On Fri, Jan 16, 2026 at 9:14 AM Dheeraj Turaga <[email protected]> wrote:
>>>>
>>>> > +1 (binding)
>>>> >
>>>> > Sriraj Dheeraj Turaga
>>>> >
>>>> > On Fri, Jan 16, 2026 at 9:19 AM Shahar Epstein <[email protected]> wrote:
>>>> >
>>>> > > +1 (binding)
>>>> > >
>>>> > > On Fri, Jan 16, 2026 at 3:39 PM Blain David <[email protected]> wrote:
>>>> > >
>>>> > > > Hi Everyone,
>>>> > > >
>>>> > > > I would like to call a vote on this AIP:
>>>> > > >
>>>> > > > https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-98%3A+Add+async+support+for+PythonOperator+in+Airflow+3
>>>> > > >
>>>> > > > There was already a discussion on the devlist regarding this proposal:
>>>> > > > https://lists.apache.org/thread/ztnfsqolow4v1zsv4pkpnxc1fk0hbf2p
>>>> > > >
>>>> > > > This AIP is already implemented and merged as a PR:
>>>> > > > https://github.com/apache/airflow/pull/60268
>>>> > > >
>>>> > > > The vote will run for 5 days and last until next Thursday, the 22nd of Jan 2026, 23:30 GMT.
>>>> > > >
>>>> > > > Everyone is encouraged to vote, although only PMC members' and Committers' votes are considered binding.
>>>> > > >
>>>> > > > Please vote accordingly:
>>>> > > >
>>>> > > > [ ] +1 Approve
>>>> > > > [ ] +0 no opinion
>>>> > > > [ ] -1 disapprove with the reason
>>>> > > >
>>>> > > > I hereby cast my own vote: +1 (binding) :)
>>>> > > >
>>>> > > > Regards,
>>>> > > >
>>>> > > > David aka dabla
