I've been working on making Deadline callbacks work on the executor in addition to the Triggerer. That means introducing a new Workload type for the executor. I didn't want to tie it specifically to Deadlines and realized it would be a good idea to introduce generic Callbacks that the new workload can reference.
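For concreteness, here is a rough sketch of the shape this could take. Everything here is illustrative, not the PR's actual schema: class names, field names, and the `mymod.notify` path are invented, and while Airflow's real executor Workload classes are Pydantic models, plain dataclasses keep the sketch self-contained.

```python
from dataclasses import dataclass, field
from typing import Any

# Illustrative only: a generic Callback row that concrete callback kinds
# (Deadline callbacks today, the on_*_callbacks later) could reference,
# plus an executor Workload type that points at it.

@dataclass
class Callback:
    """A generic callback definition, persisted in a new Callback table."""
    id: str
    callback_path: str              # dotted path to the callable to run
    callback_kwargs: dict[str, Any] = field(default_factory=dict)

@dataclass
class ExecuteCallback:
    """A new Workload type: run a Callback on the executor, not only the Triggerer."""
    callback: Callback
    kind: str = "ExecuteCallback"   # discriminator, mirroring how ExecuteTask is tagged

wl = ExecuteCallback(Callback(id="1", callback_path="mymod.notify", callback_kwargs={"level": "info"}))
print(wl.kind)  # ExecuteCallback
```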
Here's an initial PR adding a new Callback table and refactoring the existing Deadline callbacks that run on the Triggerer to use it instead: https://github.com/apache/airflow/pull/54796 Ideally, once AIP-92 is in progress, these callbacks can be used for all the `on_*_callback`s as well. I want to ensure that these are generic enough for additional subclasses to implement, so I'm looking for feedback/comments to align on a common definition for the new Workload type and the Callback model definition.

Thanks,
Ramit

On 2025-08-13, 1:39 AM, "Jarek Potiuk" <ja...@potiuk.com> wrote:

Those are all questions that we will open up for discussion once we get the basic foundation :)

On Wed, Aug 13, 2025 at 10:16 AM Sumit Maheshwari <sumeet.ma...@gmail.com> wrote:

> Awesome, having a standard approach for all kinds of authentication would be great - looking forward to it.
>
> BTW, on a side note, I see that as of now things like Connections, Variables, and XComs, which are present under the Execution API namespace, don't have any authorization model (left with TODOs). Is there any plan for how they will work? We might need something similar for the Dag-processor/Triggerer as well.
>
> Also, as we've decided to create a different API namespace rather than use the execution namespace for hosting the new APIs required by the dag-processor/triggerer, do we have to copy these classes & routes, or at least refactor them so they can serve all API namespaces?
>
> On Wed, Aug 13, 2025 at 1:27 PM Jarek Potiuk <ja...@potiuk.com> wrote:
>
> > Will do. We are also discussing - for now, within the security team - the various aspects of the authentication approach we want to have, for both "UI/User authentication" and "long-running services", and how they relate to token exchange, invalidation and other scenarios. What we will come up with - I hope shortly - is a proposal for a general "model" of all kinds of security and authentication scenarios, so that we do not have to reinvent the wheel and try to figure out all the aspects in individual AIPs. We are not far from bringing it to open discussion on the devlist, but we have to be careful about some of the aspects that we might need to improve in the current setup to close some - small - loopholes, so bear with us :)
> >
> > On Wed, Aug 13, 2025 at 9:33 AM Sumit Maheshwari <sumeet.ma...@gmail.com> wrote:
> >
> > > Thanks Ash and Jarek for the detailed comments. I largely agree with the points you mentioned, hence I updated the AIP and added sections on Authentication, API Versioning, and Packaging as well. Please go through it once more and let me know if there are more things to consider before I open it for voting.
> > >
> > > On Thu, Aug 7, 2025 at 6:03 PM Jarek Potiuk <ja...@potiuk.com> wrote:
> > >
> > > > Also
> > > >
> > > > > 1. The Authentication token. How will this long-lived token work without being insecure? Who and what will generate it? How will we identify top-level requests for Variables in order to be able to add Variable RBAC/ACLs? This is an important enough thing that I think it needs discussion before we vote on this AIP.
> > > >
> > > > We are currently discussing - in the security team - an approach for JWT token handling, so likely we could move the discussion there. It does have some security implications, and I think we should bring our findings to the devlist when we complete it, but I think we should add this case there. IMHO we should have a different approach for the UI, a different one for Tasks, a different one for the Triggerer, and a different one for the DagProcessor (possibly the Triggerer and DagProcessor could be the same, because they share essentially the same long-living token). Ash - I will add this to the discussion there.
> > > >
> > > > J.
> > > >
> > > > On Thu, Aug 7, 2025 at 2:23 PM Jarek Potiuk <ja...@potiuk.com> wrote:
> > > >
> > > > > Ah.. So if we are talking about a more complete approach - seeing those comments from Ash makes me think we should have another (connected) AIP about splitting the distributions. We have never finalized it (nor even discussed it), but Ash - you had some initial document for that. So maybe we should finalize it and, rather than specify it in this AIP, have a separate AIP about the distribution split that AIP-92 could depend on. It seems much more reasonable to split "distribution and code split" from parsing isolation, I think, and implement them separately / in parallel.
> > > > >
> > > > > Reading Ash's comments (and maybe I am going a bit further than Ash), it calls for something that I am a big proponent of - splitting "airflow-core" and having different "scheduler", "webserver", "dag processor" and "triggerer" distributions.
> > > > > Now we have the capability of having "shared" code - we do not need "common" code to make it happen, because we can share code.
> > > > >
> > > > > What it could give us - on top of a clean client/server split - is that those distributions could use different dependencies. Additionally, we could also split off the executors from providers and finally implement things so that the scheduler does not use providers at all (not even the cncf.kubernetes or celery providers installed in the scheduler or webserver, but "executors" packages instead). The code-sharing approach with symlinks we have now will make it a .... breeze :). That would also imply sharing "connection" definitions through the DB, and likely finally implementing the "test connection" feature properly (i.e. executing the test connection in the worker / triggerer rather than in the web server, which is the reason why we disabled it by default now). This way the "api-server" would not need any of the providers to be installed either, which IMHO is the biggest win from a security point of view.
> > > > >
> > > > > And the nice thing about it is that it would be rather transparent when anyone uses "pip install apache-airflow" - it would behave exactly the same, no more complexity involved, simply more distributions installed when the "apache-airflow" meta-distribution is used. But it would allow those who want to implement a more complex and secure setup to have different "environments" with modularized pieces of Airflow installed: only "apache-airflow-dag-processor + task-sdk + providers" where the dag-processor runs, only "apache-airflow-scheduler + executors" where the scheduler is installed, only "apache-airflow-task-sdk + providers" where workers are running, only "apache-airflow-api-server" where the api-server is running, and only "apache-airflow-triggerer + task-sdk + providers" where the triggerer runs.
> > > > >
> > > > > I am happy (Ash, if you are fine with that) to take that original document over and lead this part and the new AIP to completion (including implementation). I am very much convinced that this will lead to much better dependency security and more modular code without impacting the "apache-airflow" installation complexity.
> > > > >
> > > > > If we do it this way, the code/clean-split part would be "delegated out" from AIP-92 to this new AIP and turned into a dependency.
> > > > >
> > > > > J.
> > > > >
> > > > > On Thu, Aug 7, 2025 at 1:51 PM Ash Berlin-Taylor <a...@apache.org> wrote:
> > > > >
> > > > >> This AIP is definitely heading in the right direction and is a feature I'd like to see.
> > > > >>
> > > > >> For me the outstanding things that need more detail:
> > > > >>
> > > > >> 1. The Authentication token. How will this long-lived token work without being insecure?
> > > > >> Who and what will generate it? How will we identify top-level requests for Variables in order to be able to add Variable RBAC/ACLs? This is an important enough thing that I think it needs discussion before we vote on this AIP.
> > > > >> 2. Security generally — how will this work, especially with multi-team? I think this likely means making the APIs work at the bundle level as you mention in the doc, but I haven't thought deeply about this yet.
> > > > >> 3. API Versioning? One of the key driving goals with AIP-72 and the Task Execution SDK was the idea that "you can upgrade the API server as you like, and your clients/workers never need to change" — i.e. the API server works 100% with all older versions of the TaskSDK. I don't know if we will achieve that goal in the long run, but it is the desire, and part of why we are using CalVer and the Cadwyn library to provide API versioning.
> > > > >> 4. As mentioned previously, I'm not sure the existing serialised JSON format for DAGs is correct, but since that now has a version, and we already have the ability to upgrade it somewhere in Airflow Core, it doesn't necessarily become a blocker/pre-requisite for this AIP.
> > > > >>
> > > > >> I think the Dag parsing API client + submission + parsing process manager should either live in the Task SDK dist, or in a new separate dist that uses the TaskSDK, but crucially not in apache-airflow-core.
> > > > >> My reason for this is that I want it to be possible for the server components (scheduler, API server) to not need the task-sdk installed (just for cleanliness / avoiding confusion about what versions it needs), and also vice versa: to be able to run a "team worker bundle" (Dag parsing, workers, triggerer/async workers) on whatever version of the TaskSDK they choose, again without apache-airflow-core installed, for avoidance of doubt.
> > > > >>
> > > > >> Generally I would like this as it means we can have a nicer separation of Core and Dag parsing code. As the dag parsing itself uses the SDK, it would be nice to have a proper server/client split, both from a tighter security point of view and from a code layout point of view.
> > > > >>
> > > > >> -ash
> > > > >>
> > > > >> > On 7 Aug 2025, at 12:36, Jarek Potiuk <ja...@potiuk.com> wrote:
> > > > >> >
> > > > >> > Well, you started it - so it's up to you to decide if you think we have consensus, or whether we need a vote.
> > > > >> >
> > > > >> > And it's not a question of an "informal" vote - it's rather clear, following https://www.apache.org/foundation/voting.html, that we either need a LAZY CONSENSUS or a VOTE thread. Both are formal.
> > > > >> >
> > > > >> > This is the difficult part when you have a proposal: to assess (by you) whether we are converging to consensus or whether a vote is needed. There is no other body or "authority" to do it for you.
> > > > >> >
> > > > >> > J.
> > > > >> >
> > > > >> > On Thu, Aug 7, 2025 at 1:02 PM Sumit Maheshwari <sumeet.ma...@gmail.com> wrote:
> > > > >> >
> > > > >> >> Sorry for nudging again, but can we get to some consensus on this? I mean, if this AIP isn't good enough, then we can drop it altogether and someone can rethink the whole thing. Should we do some kind of informal voting and close this thread?
> > > > >> >>
> > > > >> >> On Mon, Aug 4, 2025 at 3:32 PM Jarek Potiuk <ja...@potiuk.com> wrote:
> > > > >> >>
> > > > >> >>>> My main concern with this right now is the serialisation format of DAGs — it wasn't really designed with remote submission in mind, so it needs some careful examination to see if it is fit for this purpose or not.
> > > > >> >>>
> > > > >> >>> I understand Ash's concerns - the format has not been designed with size/speed optimization in mind, so **possibly** we could design a different format that would be better suited.
> > > > >> >>>
> > > > >> >>> BUT ... done is better than perfect.
> > > > >> >>>
> > > > >> >>> I think there are a number of risks involved in changing the format, and it could significantly increase development time with uncertain gains at the end - also because of the progress in compression that has happened over the last few years.
> > > > >> >>>
> > > > >> >>> It might be a good idea to experiment a bit with different compression algorithms for "our" dag representation, and possibly we could find the best algorithm for the "airflow dag" type of JSON data. There are a lot of repetitions in the JSON representation, and I guess in "our" JSON representation there are some artifacts and repeated sections that simply might compress well with different algorithms. Also, in this case speed matters (and the CPU trade-off).
> > > > >> >>>
> > > > >> >>> Looking at compression "theory" - before we experiment with it - there is the relatively new "zstandard" compression standard, https://github.com/facebook/zstd, open-sourced in 2016, which I've heard good things about - especially that it maintains a very good compression ratio for text data while also being tremendously fast - especially for decompression (which is a super important factor for us: in the general case we compress a new DAG representation far less often than we decompress it).
> > > > >> >>> It is standardized in RFC 8878, https://datatracker.ietf.org/doc/html/rfc8878, there are various implementations, it is even being added to the Python standard library in Python 3.14, https://docs.python.org/3.14/library/compression.zstd.html, and there is a very well-maintained Python binding, https://pypi.org/project/zstd/, to Yann Collet's (the algorithm author's) ZSTD C library. And libzstd is already part of our images - it is needed by other dependencies of ours. All with a BSD licence, directly usable by us.
> > > > >> >>>
> > > > >> >>> I think this one might be a good candidate for us to try, and possibly with zstd we could achieve both size and CPU overhead comparable with any "new" format we could come up with - especially since we are talking merely about converting a huge blob between its "storable" (compressed) and "locally usable" (Python dict) states. We could likely use a streaming JSON library (say the one used inside Pydantic, https://github.com/pydantic/jiter - we already have it as part of Pydantic) to also save memory - we could stream the decompressed data into jiter so that the JSON dict and the string representation do not both have to be loaded fully in memory at the same time.
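As a rough illustration of the experiment proposed above, the sketch below measures the compression ratio of a deliberately repetitive JSON blob. It uses the stdlib `zlib` purely as a stand-in (zstd would be the actual candidate, via the bindings or Python 3.14's `compression.zstd`), and the task structure is invented for illustration, not taken from Airflow's real serialized-DAG format.

```python
import json
import zlib

# Build a repetitive JSON blob loosely mimicking a serialized DAG: many
# tasks sharing near-identical attribute dictionaries. (Invented structure.)
tasks = [
    {
        "task_id": f"task_{i}",
        "operator": "PythonOperator",
        "retries": "0",             # strings where a bool/int would suffice
        "depends_on_past": "False",
        "pool": "default_pool",
    }
    for i in range(2000)
]
raw = json.dumps({"dag_id": "example", "tasks": tasks}).encode()

compressed = zlib.compress(raw, level=6)
ratio = len(raw) / len(compressed)
print(f"raw={len(raw)}B compressed={len(compressed)}B ratio={ratio:.1f}x")

# Highly repetitive JSON like this compresses by an order of magnitude even
# with zlib; zstd generally matches or beats the ratio while decompressing
# much faster - and decompression is the hot path for serialized DAGs.
assert ratio > 5
```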
> > > > >> >>> There are likely lots of optimisations we could do - I mentioned possibly streaming the data from the API directly to the DB (if that is possible - not sure).
> > > > >> >>>
> > > > >> >>> J.
> > > > >> >>>
> > > > >> >>> On Mon, Aug 4, 2025 at 9:10 AM Sumit Maheshwari <sumeet.ma...@gmail.com> wrote:
> > > > >> >>>
> > > > >> >>>>> My main concern with this right now is the serialisation format of DAGs — it wasn't really designed with remote submission in mind, so it needs some careful examination to see if it is fit for this purpose or not.
> > > > >> >>>>
> > > > >> >>>> I'm not sure about this point, cause if we are able to convert a DAG into JSON, then it has to be transferable over the internet.
> > > > >> >>>>
> > > > >> >>>>> In particular, one of the things I worry about is that the JSON can get huge — I've seen this as large as 10-20Mb for some dags
> > > > >> >>>>
> > > > >> >>>> Yeah, agree on this - that's why we can transfer compressed data instead of the real JSON. Of course, this won't guarantee that the payload will always be small enough, but we can't say that it'll definitely happen either.
> > > > >> >>>>> I also wonder if as part of this proposal we should move the Callback requests off the dag parsers and on to the workers instead
> > > > >> >>>>
> > > > >> >>>>> let's make such a "workload" implementation stream that could support both - Deadlines and DAG parsing logic
> > > > >> >>>>
> > > > >> >>>> I don't have any strong opinion here, but it feels like it's gonna blow up the scope of the AIP too much.
> > > > >> >>>>
> > > > >> >>>> On Fri, Aug 1, 2025 at 2:27 AM Jarek Potiuk <ja...@potiuk.com> wrote:
> > > > >> >>>>
> > > > >> >>>>>> My main concern with this right now is the serialisation format of DAGs — it wasn't really designed with remote submission in mind, so it needs some careful examination to see if it is fit for this purpose or not.
> > > > >> >>>>>
> > > > >> >>>>> Yep. That might potentially be a problem (or at least "need more resources to run Airflow"), and that is where my "2x memory" came from, if we do it in a trivial way. Currently we a) keep the whole DAG in memory when serializing it, and b) submit it to the database (also using essentially some kind of API, implemented by the database client) - so we know the whole thing "might work". But indeed, with a trivial implementation of submitting the whole JSON, the whole JSON will also have to be kept in the memory of the API server.
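The chunked-streaming idea discussed here can be sketched as follows. This is a minimal illustration using stdlib `zlib` as a stand-in for zstd's streaming API; in a real pipeline each chunk would be fed to a streaming JSON parser (such as jiter) or appended to a DB blob rather than reassembled, which is what would actually bound memory on the API server.

```python
import io
import json
import zlib

# A compressed payload standing in for a serialized-DAG blob.
payload = zlib.compress(json.dumps({"tasks": [{"id": i} for i in range(10000)]}).encode())

def iter_decompressed_chunks(stream, chunk_size=64 * 1024):
    """Yield decompressed chunks without materializing the whole blob."""
    decomp = zlib.decompressobj()
    while chunk := stream.read(chunk_size):
        yield decomp.decompress(chunk)
    tail = decomp.flush()
    if tail:
        yield tail

# The receiver only holds one compressed chunk plus its decompressed output
# at a time; here we reassemble only to demonstrate the round trip.
restored = b"".join(iter_decompressed_chunks(io.BytesIO(payload)))
assert json.loads(restored)["tasks"][0] == {"id": 0}
```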
> > > > >> >>>>> But we also compress it when needed - I wonder what compression ratios we saw with those 10-20MB Dags. If the problem is using strings where a bool would suffice, compression should generally help a lot. We could only ever send compressed data over the API - there seems to be no need to send "plain JSON" data over the API or to store the plain JSON in the DB (of course, that trades memory for CPU).
> > > > >> >>>>>
> > > > >> >>>>> I wonder if SQLAlchemy 2 (and the drivers for MySQL/Postgres) supports any kind of binary data streaming - that could help a lot if we could use a streaming HTTP API and chunk and append the binary chunks (when writing), or read data in chunks and stream them back via the API. That could seriously decrease the amount of memory needed by the API server to process such huge serialized dags.
> > > > >> >>>>>
> > > > >> >>>>> And yeah - I would also love the "execute task" part to be implemented here - but I am not sure if this should be part of the same effort or maybe a separate implementation? That sounds very loosely coupled with DB isolation. And it seems a common theme - I think it would also cover the sync Deadline alerts case that we discussed at the dev call today.
> > > > >> >>>>> I wonder if that should not be a kind of parallel effort (let's make such a "workload" implementation stream that could support both Deadlines and the DAG parsing logic). We already have two "users" for it, and I really love the saying "if you want to make something reusable - make it usable first" - it seems like we might have a good opportunity to make such a workload implementation "doubly used" from the beginning, which would increase the chances it will be "reusable" for other things as well :).
> > > > >> >>>>>
> > > > >> >>>>> J.
> > > > >> >>>>>
> > > > >> >>>>> On Thu, Jul 31, 2025 at 12:28 PM Ash Berlin-Taylor <a...@apache.org> wrote:
> > > > >> >>>>>
> > > > >> >>>>>> My main concern with this right now is the serialisation format of DAGs — it wasn't really designed with remote submission in mind, so it needs some careful examination to see if it is fit for this purpose or not.
> > > > >> >>>>>>
> > > > >> >>>>>> In particular, one of the things I worry about is that the JSON can get huge — I've seen this as large as 10-20Mb for some dags(!!) (which is likely due to things being included as text when a bool might suffice, for example). But I don't think "just submit the existing JSON over an API" is a good idea.
> > > > >> >>>>>>
> > > > >> >>>>>> I also wonder if as part of this proposal we should move the Callback requests off the dag parsers and on to the workers instead — in AIP-72 we introduced the concept of a Workload, the only existing one right now being "ExecuteTask":
> > > > >> >>>>>> https://github.com/apache/airflow/blob/8e1201c7713d5c677fa6f6d48bbd4f6903505f61/airflow-core/src/airflow/executors/workloads.py#L87-L88
> > > > >> >>>>>> — it might be time to finally move task and dag callbacks to the same thing and make dag parsers only responsible for, well, parsing. :)
> > > > >> >>>>>>
> > > > >> >>>>>> These are all solvable problems, and this will be a great feature to have, but we need to do some more thinking and planning first.
> > > > >> >>>>>>
> > > > >> >>>>>> -ash
> > > > >> >>>>>>
> > > > >> >>>>>>> On 31 Jul 2025, at 10:12, Sumit Maheshwari <sumeet.ma...@gmail.com> wrote:
> > > > >> >>>>>>>
> > > > >> >>>>>>> Gentle reminder for everyone to review the proposal.
> > > > >> >>>>>>>
> > > > >> >>>>>>> Updated link:
> > > > >> >>>>>>> https://cwiki.apache.org/confluence/display/AIRFLOW/%5BWIP%5D+AIP-92+Isolate+DAG+processor%2C+Callback+processor%2C+and+Triggerer+from+core+services
> > > > >> >>>>>>>
> > > > >> >>>>>>> On Tue, Jul 29, 2025 at 4:37 PM Sumit Maheshwari <sumeet.ma...@gmail.com> wrote:
> > > > >> >>>>>>>
> > > > >> >>>>>>>> Thanks everyone for reviewing this AIP. As Jarek and others suggested, I expanded the scope of this AIP and divided it into three phases. With the increased scope, the boundary line between this AIP and AIP-85 got a little thinner, but I believe these are still two different enhancements to make.
> > > > >> >>>>>>>>
> > > > >> >>>>>>>> On Fri, Jul 25, 2025 at 10:51 PM Sumit Maheshwari <sumeet.ma...@gmail.com> wrote:
> > > > >> >>>>>>>>
> > > > >> >>>>>>>>> Yeah, overall it makes sense to include Triggers as well as part of this AIP and to phase out the implementation.
> > > > >> >>>>>>>>> Though I didn't exclude Triggers because "Uber" doesn't need that - I just thought of keeping the scope of development small and achievable, just like it was done in Airflow 3 by secluding only workers and not the DAG-processor & Triggers.
> > > > >> >>>>>>>>>
> > > > >> >>>>>>>>> But if you think Triggers should be part of this AIP itself, then I can do that and include Triggers as well.
> > > > >> >>>>>>>>>
> > > > >> >>>>>>>>> On Fri, Jul 25, 2025 at 7:34 PM Jarek Potiuk <ja...@potiuk.com> wrote:
> > > > >> >>>>>>>>>
> > > > >> >>>>>>>>>> I would very much prefer the architectural choices of this AIP to be based on "general public" needs rather than "Uber needs", even if Uber will be implementing it - so from my point of view, having Trigger separation as part of it is quite important.
> > > > >> >>>>>>>>>>
> > > > >> >>>>>>>>>> But that's not even it.
> > > > >> >>>>>>>>>>
> > > > >> >>>>>>>>>> We've been discussing, for example for Deadlines (being implemented by Dennis and Ramit), the possibility of short, notification-style "deadlines" being sent to the triggerer for execution - this is well advanced now, and whether you want it or not, Dag-provided code might be serialized and sent to the triggerer for execution.
This is part of our > "broader" > > > > >> >>>>>> architectural > > > > >> >>>>>>>>>> change where we treat "workers" and "triggerer" > similarly > > > as > > > > a > > > > >> >>>>> general > > > > >> >>>>>>>>>> executors of "sync" and "async" tasks respectively. > > That's > > > > >> >> where > > > > >> >>>>>> Airflow > > > > >> >>>>>>>>>> is > > > > >> >>>>>>>>>> evolving towards - inevitably. > > > > >> >>>>>>>>>> > > > > >> >>>>>>>>>> But we can of course phase things in out for > > > implementation - > > > > >> >>> even > > > > >> >>>>> if > > > > >> >>>>>> AIP > > > > >> >>>>>>>>>> should cover both, I think if the goal of the AIP and > > > > preamble > > > > >> >>> is > > > > >> >>>>>> about > > > > >> >>>>>>>>>> separating "user code" from "database" as the main > > reason, > > > it > > > > >> >>> also > > > > >> >>>>>> means > > > > >> >>>>>>>>>> Triggerer if you ask me (from design point of view at > > > least). > > > > >> >>>>>>>>>> > > > > >> >>>>>>>>>> Again implementation can be phased and even different > > > people > > > > >> >> and > > > > >> >>>>> teams > > > > >> >>>>>>>>>> might work on those phases/pieces. > > > > >> >>>>>>>>>> > > > > >> >>>>>>>>>> J. > > > > >> >>>>>>>>>> > > > > >> >>>>>>>>>> On Fri, Jul 25, 2025 at 2:29 PM Sumit Maheshwari < > > > > >> >>>>>> sumeet.ma...@gmail.com <mailto:sumeet.ma...@gmail.com> > > > > >> >>>>>>>>>>> > > > > >> >>>>>>>>>> wrote: > > > > >> >>>>>>>>>> > > > > >> >>>>>>>>>>>> > > > > >> >>>>>>>>>>>>> #2. Yeah, we would need something similar for > > triggerers > > > > as > > > > >> >>>> well, > > > > >> >>>>>>>>>> but > > > > >> >>>>>>>>>>>> that > > > > >> >>>>>>>>>>>> can be done as part of a different AIP > > > > >> >>>>>>>>>>> > > > > >> >>>>>>>>>>> > > > > >> >>>>>>>>>>> You won't achieve your goal of "true" isolation of > user > > > code > > > > >> >> if > > > > >> >>>> you > > > > >> >>>>>>>>>> don't > > > > >> >>>>>>>>>>>> do triggerer. 
> I think if the goal is to achieve it - it should cover both.

My bad, I should've explained our architecture for triggers as well; apologies. So here it is:

- Triggers would be running on a centralized service, so all the Trigger classes will be part of the platform team's repo and not the customer's repo
- The triggers won't be able to use any libs other than std ones which are already used in core Airflow (like requests, etc.)
- As we are the owners of the core Airflow repo, customers have to get our approval to land any class in this path (unlike the dags repo, which they own)
- When a customer's task defers, we would have an allowlist on our side to check whether we should do the async polling or not
- If the Trigger class isn't part of our repo (allowlist), just fail the task, as we won't have the code that they used in the trigger class anyway
- If any of these conditions aren't suitable for you (as a customer), feel free to use sync tasks only
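The deferral flow in the list above could be sketched roughly like this. All names here (`ALLOWED_TRIGGER_CLASSPATHS`, `handle_deferral`) are hypothetical illustrations of the proposed platform-side check, not actual Airflow APIs:

```python
# Hypothetical sketch of the platform-side allowlist check described above:
# when a customer's task defers, verify the trigger's classpath before
# doing any async polling on the centralized triggerer service.

# Trigger classes vetted into the platform team's repo (illustrative entries).
ALLOWED_TRIGGER_CLASSPATHS = {
    "airflow.triggers.temporal.DateTimeTrigger",
    "airflow.providers.http.triggers.http.HttpTrigger",
}

def handle_deferral(trigger_classpath: str) -> str:
    """Decide what happens when a customer's task defers."""
    if trigger_classpath in ALLOWED_TRIGGER_CLASSPATHS:
        # The trigger code exists in the platform repo, so the centralized
        # triggerer can safely run the async polling.
        return "defer"
    # Unknown trigger class: the triggerer would not have the code anyway,
    # so fail the task immediately rather than leave it hanging.
    return "fail"
```

The key design point is that the check happens on the platform side at deferral time, so customer-owned DAG repos never get arbitrary code onto the shared triggerer.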
But in general, I agree with making the triggerer svc also communicate over APIs only. If that is done, then we can have instances of the triggerer svc running on the customer's side as well, which can process any type of trigger class. That's not a blocker for us at the moment, though, because triggerers are mostly doing just polling using simple libs like requests.

On Fri, Jul 25, 2025 at 5:03 PM Igor Kholopov <ikholo...@google.com.invalid> wrote:

Thanks Sumit for the detailed proposal. Overall I believe it aligns well with the goals of making Airflow well-scalable beyond a single-team deployment (and the AIP-85 goals), so you have my full support on this one.

I've left a couple of clarification requests on the AIP page.

Thanks,
Igor

On Fri, Jul 25, 2025 at 11:50 AM Sumit Maheshwari <sumeet.ma...@gmail.com> wrote:

Thanks Jarek and Ash for the initial review.
It's good to know that the DAG processor has some preemptive measures in place to prevent access to the DB. However, the main issue we are trying to solve is not having to provide DB creds to the customer teams who are using Airflow as a multi-tenant orchestration platform. I've updated the doc to reflect this point as well.

Answering Jarek's points:

#1. Yeah, I had forgotten to write about the token mechanism; added that to the doc, but how the token can be obtained (safely) is still an open question in my mind. I believe the token used by task executors can be created outside of it as well (I may be wrong here).

#2. Yeah, we would need something similar for triggerers as well, but that can be done as part of a different AIP.

#3. Yeah, I also believe the API should largely work.

#4.
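For what it's worth, one illustrative shape for a short-lived token minted outside the worker and verified by the API side is an HMAC-signed payload. This is purely a sketch of the general idea, not Airflow's actual token scheme (which, as noted in #1, is still an open question); `SECRET`, `mint_token`, and `verify_token` are all hypothetical:

```python
import hashlib
import hmac
import json
import time

# Illustrative only: a short-lived token minted by a trusted component
# (e.g. the scheduler) and verified by the API server. NOT Airflow's
# actual mechanism - just a sketch of the "created outside" idea above.

SECRET = b"shared-signing-key"  # assumed shared between minter and verifier

def mint_token(subject: str, ttl_seconds: int = 300) -> str:
    """Sign a subject + expiry payload with the shared key."""
    payload = json.dumps({"sub": subject, "exp": time.time() + ttl_seconds})
    sig = hmac.new(SECRET, payload.encode(), hashlib.sha256).hexdigest()
    return f"{payload}|{sig}"

def verify_token(token: str) -> bool:
    """Check the signature and that the token has not expired."""
    payload, _, sig = token.rpartition("|")
    expected = hmac.new(SECRET, payload.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(sig, expected):
        return False
    return json.loads(payload)["exp"] > time.time()
```

A real deployment would presumably use a proper JWT library and per-component audiences, but the flow (mint in a trusted process, verify statelessly at the API) is the part relevant to the discussion.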
Added that in the AIP: instead of dag_dirs we can work with dag_bundles, and every dag-processor instance would be treated as a diff bundle.

Also added points around callbacks, as these are also fetched directly from the DB.

On Fri, Jul 25, 2025 at 11:58 AM Jarek Potiuk <ja...@potiuk.com> wrote:

> A clarification to this - the dag parser today is likely not protection against a dedicated malicious DAG author, but it does protect against casual DB access attempts - the db session is blanked out in the parsing process, as are the env var configs:
> https://github.com/apache/airflow/blob/main/task-sdk/src/airflow/sdk/execution_time/supervisor.py#L274-L316
> Is this perfect? No, but it's much more than no protection.

Oh absolutely..
This is exactly what we discussed back then in March, I think - and the way we decided to go for 3.0, with full knowledge that it's not protecting against all threats.

On Fri, Jul 25, 2025 at 8:22 AM Ash Berlin-Taylor <a...@apache.org> wrote:

A clarification to this - the dag parser today is likely not protection against a dedicated malicious DAG author, but it does protect against casual DB access attempts - the db session is blanked out in the parsing process, as are the env var configs:
https://github.com/apache/airflow/blob/main/task-sdk/src/airflow/sdk/execution_time/supervisor.py#L274-L316
Is this perfect? No,
but it's much more than no protection.

> On 24 Jul 2025, at 21:56, Jarek Potiuk <ja...@potiuk.com> wrote:
>
> Currently in the DagFile processor there is no built-in protection against user code from Dag parsing - for example - reading database credentials from the airflow configuration and using them to talk to the DB directly.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@airflow.apache.org
For additional commands, e-mail: dev-h...@airflow.apache.org
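The db-session/env-var "blanking" Ash and Jarek describe can be sketched roughly as follows. The real implementation is in the supervisor.py lines linked above; the prefix list and `blank_db_access` function here are hypothetical simplifications:

```python
# Rough sketch (NOT the actual supervisor.py code) of the kind of blanking
# applied before user DAG code runs in the parsing subprocess: scrub
# DB-related settings so casual attempts to build a direct DB connection
# from the environment find nothing.

# Illustrative prefixes; the real list of scrubbed config keys may differ.
SENSITIVE_PREFIXES = ("AIRFLOW__DATABASE__", "AIRFLOW__CORE__SQL_ALCHEMY")

def blank_db_access(environ: dict) -> dict:
    """Return a copy of the environment with DB credential settings removed."""
    return {
        key: value
        for key, value in environ.items()
        if not key.startswith(SENSITIVE_PREFIXES)
    }
```

As the thread notes, this guards against casual access, not a dedicated malicious DAG author - which is exactly why the AIP pushes parsing and triggerer traffic onto an authenticated API instead.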