Thanks Ash and Jarek for the detailed comments. I largely agree with the points you both raised, hence I updated the AIP and added sections on Authentication, API Versioning, and Packaging as well. Please go through it once more and let me know if there are more things to consider before I open it for voting.
On Thu, Aug 7, 2025 at 6:03 PM Jarek Potiuk <ja...@potiuk.com> wrote: > Also > > > 1. The Authentication token. How will this long lived token work without > being insecure. Who and what will generate it? How will we identify > top-level requests for Variables in order to be able to add Variable > RBAC/ACLs. This is an important enough thing that I think it needs > discussion before we vote on this AIP. > > We are currently discussing - in the security team - the approach for JWT token > handling, so likely we could move the discussion there, it does have some > security implications and I think we should bring our findings to the > devlist when we complete it, but I think we should add this case there. > IMHO we should have a different approach for UI, different for Tasks, > different for Triggerer, and different for DagProcessor (possibly the > Triggerer and DagProcessor could be the same because they share essentially > the same long-living token). Ash - I will add this to the discussion there. > > J. > > > > On Thu, Aug 7, 2025 at 2:23 PM Jarek Potiuk <ja...@potiuk.com> wrote: > > > Ah.. So if we are talking about a more complete approach - seeing those > > comments from Ash - makes me think we should have another AIP > > (connected) about splitting the distributions. We have never finalized it > > (nor even discussed it) but Ash - you had some initial document for that. > > So maybe we should finalize it and rather than specify it in this AIP - > > have a separate AIP about distribution split that AIP-92 could depend on. > > It seems much more reasonable to split "distribution and code split" from > > parsing isolation I think and implement them separately/in parallel. > > > > Reading Ash's comments (and maybe I am going a bit further than Ash) it > > calls for something that I am a big proponent of - splitting "airflow-core" > > and having different "scheduler", "webserver", "dag processor" and > > "triggerer" distributions. 
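Jarek's point that each component (UI, Tasks, Triggerer, DagProcessor) may need a different token approach can be sketched as audience-scoped, expiring tokens. Everything below (the claim names, the HMAC scheme, the helper names) is a hypothetical illustration, not the design the security team is actually discussing:

```python
# Sketch only: component-scoped, expiring tokens, so a DagProcessor token
# cannot be replayed against Triggerer endpoints. Hypothetical, stdlib-only
# stand-in for a real JWT library such as PyJWT.
import base64
import hashlib
import hmac
import json
import time

SECRET = b"shared-secret"  # in practice: per-deployment key material


def issue_token(component: str, ttl_seconds: int) -> str:
    """Mint a token whose audience ("aud") names the component it is for."""
    payload = {"aud": component, "exp": time.time() + ttl_seconds}
    body = base64.urlsafe_b64encode(json.dumps(payload).encode()).decode()
    sig = hmac.new(SECRET, body.encode(), hashlib.sha256).hexdigest()
    return f"{body}.{sig}"


def validate_token(token: str, expected_component: str) -> bool:
    """Check signature, audience, and expiry before honoring a request."""
    body, sig = token.rsplit(".", 1)
    expected_sig = hmac.new(SECRET, body.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(sig, expected_sig):
        return False
    payload = json.loads(base64.urlsafe_b64decode(body))
    return payload["aud"] == expected_component and payload["exp"] > time.time()


token = issue_token("dag-processor", ttl_seconds=3600)
assert validate_token(token, "dag-processor")
assert not validate_token(token, "triggerer")  # audience mismatch is rejected
```

A short TTL plus a refresh path would mitigate the "long-lived token" concern Ash raises; the long-living DagProcessor/Triggerer case is exactly the part still open for discussion.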
Now - we have the capability of having "shared" code - we do not need "common" code to make it happen - because we can share code. > > > > What it could give us - on top of a clean client/server split, we could have > > different dependencies used by those distributions. Additionally, we could > > also split off the executors from providers and finally implement it in the > > way that the scheduler does not use providers at all (not even cncf.kubernetes > > nor celery providers installed in the scheduler nor webserver, but "executors" > > packages instead). The code sharing approach with symlinks we have now will > > make it a .... breeze :) . That would also imply sharing "connection" > > definitions through the DB, and likely implementing the "test connection" > > feature properly finally (i.e. executing test connection in the worker / > > triggerer rather than in the web server, which is the reason why we disabled it by > > default now). This way the "api-server" would not need any of the providers to > > be installed either, which IMHO is the biggest win from a security point of > > view. > > > > And the nice thing about it is that it would be rather transparent when > > anyone uses "pip install apache-airflow" - it would behave exactly the > > same, no more complexity involved, simply more distributions installed when > > the "apache-airflow" meta-distribution is used, but it would allow those who > > want to implement a more complex and secure setup to have different > > "environments" with modularized pieces of airflow installed - only > > "apache-airflow-dag-processor + task-sdk + providers" where the dag-processor > > is run, only "apache-airflow-scheduler + executors" where the scheduler is > > installed, only "apache-airflow-task-sdk + providers" where workers are > > running, only "apache-airflow-api-server" where the api-server is running, and > > only "apache-airflow-triggerer + task-sdk + providers" where the triggerer is running. 
> > > > I am happy (Ash, if you are fine with that) to take that original document > > over and lead this part and the new AIP to completion (including > > implementation), I am very much convinced that this will lead to much > > better dependency security and more modular code without impacting the > > "apache-airflow" installation complexity. > > > > If we do it this way - the part of code/clean split would be "delegated > > out" from AIP-92 to this new AIP and turned into a dependency. > > > > J. > > > > > > On Thu, Aug 7, 2025 at 1:51 PM Ash Berlin-Taylor <a...@apache.org> wrote: > > > >> This AIP is definitely heading in the right direction and is a feature > >> I’d like to see. > >> > >> For me the outstanding things that need more detail: > >> > >> 1. The Authentication token. How will this long lived token work without > >> being insecure. Who and what will generate it? How will we identify > >> top-level requests for Variables in order to be able to add Variable > >> RBAC/ACLs. This is an important enough thing that I think it needs > >> discussion before we vote on this AIP. > >> 2. Security generally — how will this work, especially with the > >> multi-team? I think this likely means making the APIs work on the bundle > >> level as you mention in the doc, but I haven’t thought deeply about this > >> yet. > >> 3. API Versioning? One of the key driving goals with AIP-72 and the > >> Task Execution SDK was the idea that “you can upgrade the API server as you > >> like, and your clients/workers never need to be upgraded” — i.e. the API server is > >> 100% working with all older versions of the TaskSDK. I don’t know if we > >> will achieve that goal in the long run but it is the desire, and part of > >> why we are using CalVer and the Cadwyn library to provide API versioning. > >> 4. 
As mentioned previously, not sure the existing serialised JSON format > >> for DAGs is correct, but since that now has a version and we already have the > >> ability to upgrade that somewhere in the Airflow Core, that doesn’t > >> necessarily become a blocker/pre-requisite for this AIP. > >> > >> I think the Dag parsing API client+submission+parsing process manager should > >> either live in the Task SDK dist, or in a new separate dist that uses > >> TaskSDK, but crucially not in apache-airflow-core. My reason for this is > >> that I want it to be possible for the server components (scheduler, API > >> server) to not need task-sdk installed (just for cleanliness/avoiding > >> confusion about what versions it needs) and also vice versa, to be able to > >> run a “team worker bundle” (Dag parsing, workers, triggerer/async workers) > >> on whatever version of TaskSDK they choose, again without > >> apache-airflow-core installed, for avoidance of doubt. > >> > >> Generally I would like this as it means we can have a nicer separation of > >> Core and Dag parsing code; as the dag parsing itself uses the SDK, it would > >> be nice to have a proper server/client split, both from a tighter security > >> point-of-view, but also from a code layout point of view. > >> > >> -ash > >> > >> > >> > On 7 Aug 2025, at 12:36, Jarek Potiuk <ja...@potiuk.com> wrote: > >> > > >> > Well, you started it - so it's up to you to decide if you think we have > >> > consensus, or whether we need a vote. > >> > > >> > And it's not a question of an "informal" vote; it's rather clear > >> following > >> > https://www.apache.org/foundation/voting.html that we either need a > >> > LAZY CONSENSUS or VOTE thread. Both are formal. > >> > > >> > This is the difficult part when you have a proposal: to assess (by you) > >> > whether we are converging to consensus or whether a vote is needed. There is > >> > no other body or "authority" to do it for you. > >> > > >> > J. 
> >> > > >> > On Thu, Aug 7, 2025 at 1:02 PM Sumit Maheshwari < sumeet.ma...@gmail.com > >> > > >> > wrote: > >> > > >> >> Sorry for nudging again, but can we come to some consensus on this? I > >> mean > >> >> if this AIP isn't good enough, then we can drop it altogether and > >> someone > >> >> can rethink the whole thing. Should we do some kind of informal voting > >> and > >> >> close this thread? > >> >> > >> >> On Mon, Aug 4, 2025 at 3:32 PM Jarek Potiuk <ja...@potiuk.com> > wrote: > >> >> > >> >>>>> My main concern with this right now is the serialisation format of > >> >> DAGs > >> >>> — > >> >>>> it wasn’t really designed with remote submission in mind, so it needs > >> >> some > >> >>>> careful examination to see if it is fit for this purpose or not. > >> >>> > >> >>> I understand Ash's concerns - the format has not been designed with > >> >>> size/speed optimization in mind so **possibly** we could design a > >> >> different > >> >>> format that would be better suited. > >> >>> > >> >>> BUT ... Done is better than perfect. > >> >>> > >> >>> I think there are a number of risks involved in changing the format > >> and > >> >> it > >> >>> could significantly increase development time with uncertain gains > >> at > >> >>> the end - also because of the progress in compression that happened > >> over > >> >>> the last few years. > >> >>> > >> >>> It might be a good idea to experiment a bit with different compression > >> >>> algorithms for "our" dag representation and possibly we could find the > >> >> best > >> >>> algorithm for the "airflow dag" type of json data. There are a lot of > >> >>> repetitions in the JSON representation and I guess in "our" json > >> >>> representation there are some artifacts and repeated sections that > >> simply > >> >>> might compress well with different algorithms. Also in this case > >> >>> speed matters (and the CPU trade-off). 
> >> >>> > >> >>> Looking at compression "theory" - before we experiment with it - > >> there is > >> >>> the relatively new standard "zstandard" > >> https://github.com/facebook/zstd > >> >>> compression, opensourced in 2016, which I've heard good things about - > >> >>> especially that it maintains a very good compression rate for text > >> data, > >> >>> but also it is tremendously fast - especially for decompression > >> (which is > >> >>> a super important factor for us - we compress a new DAG representation far > >> >> less > >> >>> often than we decompress it in the general case). It is standardized in RFC > >> >>> https://datatracker.ietf.org/doc/html/rfc8878 and there are various > >> >>> implementations and it is even being added to the Python standard library > >> in > >> >>> Python 3.14 > >> https://docs.python.org/3.14/library/compression.zstd.html > >> >> and > >> >>> there is a very well maintained python binding library > >> >>> https://pypi.org/project/zstd/ to Yann Collet's (the algorithm author) > >> ZSTD C > >> >>> library. And libzstd is already part of our images - it is needed by > >> >> other > >> >>> dependencies of ours. All with BSD licence, directly usable by us. > >> >>> > >> >>> I think this one might be a good candidate for us to try, and possibly > >> >> with > >> >>> zstd we could achieve both size and CPU overhead that would be > >> comparable > >> >>> with any "new" format we could come up with - especially as we are > >> >>> talking merely about processing a huge blob between a "storable" > >> >> (compressed) > >> >>> and a "locally usable" state (Python dict). 
We could likely use a > >> streaming > >>> JSON library (say the one that is used in Pydantic internally, > >>> https://github.com/pydantic/jiter - we already have it as part of > >>> Pydantic) > >>> to also save memory - we could stream the decompressed stream into jiter > so > >>> that the json dict and string representation do not have to be > >>> loaded fully in memory at the same time. There are likely lots of > >>> optimisations we could do - I mentioned possibly streaming the data > from the > >>> API directly to the DB (if this is possible - not sure) > >>> > >>> J. > >>> > >>> > >>> On Mon, Aug 4, 2025 at 9:10 AM Sumit Maheshwari < > >> sumeet.ma...@gmail.com> > >>> wrote: > >>> > >>>>> > >>>>> My main concern with this right now is the serialisation format of > >>> DAGs — > >>>>> it wasn’t really designed with remote submission in mind, so it needs > >>> some > >>>>> careful examination to see if it is fit for this purpose or not. > >>>>> > >>>> > >>>> I'm not sure on this point, cause if we are able to convert a DAG > >> into > >>>> JSON, then it has to be transferable over the internet. > >>>> > >>>> In particular, one of the things I worry about is that the JSON can > >> get > >>> huge > >>>>> — I’ve seen this as large as 10-20MB for some dags > >>>> > >>>> > >>>> Yeah, agree on this, that's why we can transfer compressed data > >> instead > >> of > >>>> real json. Of course, this won't guarantee that the payload will > >> always > >>> be > >>>> small enough, but we can't say that it'll definitely happen either. 
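The compression experiment Jarek proposes above can be sketched in a few lines: build a deliberately repetitive serialized-DAG-style JSON blob and measure how well it compresses. The DAG structure here is fabricated for illustration, and stdlib zlib stands in for zstd (the thread's actual candidate, available via `compression.zstd` from Python 3.14 or the PyPI bindings):

```python
# Rough sketch of the proposed compression experiment, using zlib as a
# stand-in for zstd. The "serialized DAG" below is a fabricated example of
# the kind of repetitive JSON the thread describes.
import json
import zlib

dag = {
    "dag_id": "example",
    "tasks": [
        # strings where bools/ints would do, as Ash suspects in real DAGs
        {"task_id": f"task_{i}", "operator": "BashOperator",
         "retries": "0", "depends_on_past": "false"}
        for i in range(2000)
    ],
}
raw = json.dumps(dag).encode()
compressed = zlib.compress(raw, level=6)

print(f"raw: {len(raw)} bytes, compressed: {len(compressed)} bytes, "
      f"ratio: {len(raw) / len(compressed):.1f}x")
assert len(compressed) < len(raw) // 5  # repetitive JSON compresses heavily
```

A real experiment would swap in zstd at several compression levels and also time decompression, since the thread notes DAGs are decompressed far more often than compressed.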
> >>>> > >>>> I also wonder if as part of this proposal we should move the Callback > >>>>> requests off the dag parsers and on to the workers instead > >>>> > >>>> let's make such a "workload" implementation stream that could > >> support > >>> both > >>>>> - Deadlines and DAG parsing logic > >>>> > >>>> > >>>> I don't have any strong opinion here, but it feels like it's gonna > >> blow > >>> up > >>>> the scope of the AIP too much. > >>>> > >>>> > >>>> On Fri, Aug 1, 2025 at 2:27 AM Jarek Potiuk <ja...@potiuk.com> > >> wrote: > >>>> > >>>>>> My main concern with this right now is the serialisation format of > >>>> DAGs — > >>>>> it wasn’t really designed with remote submission in mind, so it needs > >>> some > >>>>> careful examination to see if it is fit for this purpose or not. > >>>>> > >>>>> Yep. That might potentially be a problem (or at least "need more > >>>> resources > >>>>> to run airflow") and that is where my "2x memory" came from if we do > >> it > >>>> in > >>>>> a trivial way. Currently we a) keep the whole DAG in memory when > >>>>> serializing it, b) submit it to the database (also using essentially some > >>> kind > >>>>> of API - implemented by the database client) - so we know the whole > >>> thing > >>>>> "might work", but indeed if you use a trivial implementation of > >>> submitting > >>>>> the whole json - it basically means that the whole json will have to > >>> also > >>>>> be kept in the memory of the API server. But we also compress it when > >>> needed > >>>> - > >>>>> I wonder what compression ratios we saw with those 10-20MB > >>> Dags > >>>> - > >>>>> if the problem is using strings where a bool would suffice, > >> compression > >>>>> should generally help a lot. 
We could only ever send compressed data > >>> over > >>>>> the API - there seems to be no need to send "plain JSON" data over > >> the > >>>> API > >>>>> or to store the plain JSON in the DB (of course that trades memory > >> for > >>>> CPU). > >>>>> > >>>>> I wonder if sqlalchemy 2 (and the drivers for MySQL/Postgres) have > >> support > >>>> for > >>>>> any kind of binary data streaming - because that could help a lot if > >>>> we > >>>>> could use a streaming HTTP API and chunk and append the binary chunks > >>> (when > >>>>> writing) - or read data in chunks and stream them back via the API. > >>> That > >>>>> could seriously decrease the amount of memory needed by the API > >> server > >>> to > >>>>> process such huge serialized dags. > >>>>> > >>>>> And yeah - I would also love the "execute task" to be implemented > >> here > >>> - > >>>>> but I am not sure if this should be part of the same effort or maybe > >> a > >>>>> separate implementation? That sounds very loosely coupled with DB > >>>>> isolation. And it seems a common theme - I think that would also > >> cover > >>> the > >>>>> sync Deadline alerts case that we discussed at the dev call today. I > >>>> wonder > >>>>> if that should not be kind of parallel (let's make such a "workload" > >>>>> implementation stream that could support both - Deadlines and DAG > >>> parsing > >>>>> logic). We have already two "users" for it and I really love the > >> saying > >>>> "if > >>>>> you want to make something reusable - make it usable first" - seems > >>> like > >>>>> we might have a good opportunity to make such a workload implementation > >>>> "doubly > >>>>> used" from the beginning, which would increase the chances it will be > >>>>> "reusable" for other things as well :). > >>>>> > >>>>> J. 
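The chunked-streaming idea above can be sketched without the HTTP or sqlalchemy plumbing: compress a large serialized-DAG blob incrementally and reassemble it chunk by chunk, so neither side ever holds the full plain JSON plus its compressed form in one buffer. zlib again stands in for zstd, and the function names are illustrative only:

```python
# Sketch of streaming a compressed serialized-DAG blob in chunks. zlib is a
# stand-in for zstd; the transport (streaming HTTP API, DB writes) is out of
# scope here.
import json
import zlib


def compressed_chunks(data: bytes, chunk_size: int = 64 * 1024):
    """Yield compressed chunks of `data`, e.g. for a chunked HTTP upload."""
    comp = zlib.compressobj()
    for offset in range(0, len(data), chunk_size):
        if out := comp.compress(data[offset:offset + chunk_size]):
            yield out
    yield comp.flush()


def reassemble(chunks) -> bytes:
    """Receiving side: decompress chunks incrementally as they arrive."""
    decomp = zlib.decompressobj()
    return b"".join(decomp.decompress(c) for c in chunks) + decomp.flush()


blob = json.dumps({"tasks": [{"task_id": f"t{i}"} for i in range(10_000)]}).encode()
assert reassemble(compressed_chunks(blob)) == blob
```

In a real implementation the receiving side would append each decompressed (or still-compressed) chunk to the DB rather than joining in memory, which is exactly the part that depends on driver support for binary streaming.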
> >>>>> > >>>>> > >>>>> On Thu, Jul 31, 2025 at 12:28 PM Ash Berlin-Taylor < a...@apache.org> > >>>> wrote: > >>>>> > >>>>>> My main concern with this right now is the serialisation format of > >>>> DAGs — > >>>>>> it wasn’t really designed with remote submission in mind, so it needs > >>>> some > >>>>>> careful examination to see if it is fit for this purpose or not. > >>>>>> > >>>>>> In particular, one of the things I worry about is that the JSON can > >>> get > >>>>>> huge — I’ve seen this as large as 10-20MB for some dags(!!) (which > >> is > >>>>>> likely due to things being included as text when a bool might > >>> suffice, > >>>>> for > >>>>>> example) But I don’t think “just submit the existing JSON over an > >>> API” > >>>>> is a > >>>>>> good idea. > >>>>>> > >>>>>> I also wonder if as part of this proposal we should move the > >> Callback > >>>>>> requests off the dag parsers and on to the workers instead — in > >>> AIP-72 > >>>> we > >>>>>> introduced the concept of a Workload, with the only one existing > >>> right > >>>>> now > >>>>>> being “ExecuteTask” > >>>>>> > >>>>> > >>>> > >>> > >> > https://github.com/apache/airflow/blob/8e1201c7713d5c677fa6f6d48bbd4f6903505f61/airflow-core/src/airflow/executors/workloads.py#L87-L88 > >>>>>> — it might be time to finally move task and dag callbacks to the > >> same > >>>>> thing > >>>>>> and make dag parsers only responsible for, well, parsing. :) > >>>>>> > >>>>>> These are all solvable problems, and this will be a great feature > >> to > >>>>> have, > >>>>>> but we need to do some more thinking and planning first. > >>>>>> > >>>>>> -ash > >>>>>> > >>>>>>> On 31 Jul 2025, at 10:12, Sumit Maheshwari < > >> sumeet.ma...@gmail.com > >>>> > >>>>>> wrote: > >>>>>>> > >>>>>>> Gentle reminder for everyone to review the proposal. 
> >>>>>>> > >>>>>>> Updated link: > >>>>>>> > >>>>>> > >>>>> > >>>> > >>> > >> > https://cwiki.apache.org/confluence/display/AIRFLOW/%5BWIP%5D+AIP-92+Isolate+DAG+processor%2C+Callback+processor%2C+and+Triggerer+from+core+services > >>>>>>> > >>>>>>> On Tue, Jul 29, 2025 at 4:37 PM Sumit Maheshwari < > >>>>> sumeet.ma...@gmail.com > >>>>>>> > >>>>>>> wrote: > >>>>>>> > >>>>>>>> Thanks everyone for reviewing this AIP. As Jarek and others > >>>>> suggested, I > >>>>>>>> expanded the scope of this AIP and divided it into three phases. > >>>> With > >>>>>> the > >>>>>>>> increased scope, the boundary line between this AIP and AIP-85 > >>> got a > >>>>>> little > >>>>>>>> thinner, but I believe these are still two different > >> enhancements > >>> to > >>>>>> make. > >>>>>>>> > >>>>>>>> > >>>>>>>> > >>>>>>>> On Fri, Jul 25, 2025 at 10:51 PM Sumit Maheshwari < > >>>>>> sumeet.ma...@gmail.com> > >>>>>>>> wrote: > >>>>>>>> > >>>>>>>>> Yeah, overall it makes sense to include Triggers as well to be > >>> part > >>>>> of > >>>>>>>>> this AIP and phase the implementation. Though I didn't > >>> exclude > >>>>>> Triggers > >>>>>>>>> because "Uber" doesn't need that, I just thought of keeping the > >>>> scope > >>>>>> of > >>>>>>>>> development small and achievable, just like it was done in > >>>>> Airflow > >>>>>> 3 by > >>>>>>>>> isolating only workers and not the DAG-processor & Triggers. > >>>>>>>>> > >>>>>>>>> But if you think Triggers should be part of this AIP itself, > >>> then I > >>>>> can > >>>>>>>>> do that and include Triggers as well in it. 
> >>>>>>>>> > >>>>>>>>> On Fri, Jul 25, 2025 at 7:34 PM Jarek Potiuk < ja...@potiuk.com > >>> > >>>>> wrote: > >>>>>>>>> > >>>>>>>>>> I would very much prefer the architectural choices of this AIP > >>> are > >>>>>> based > >>>>>>>>>> on > >>>>>>>>>> "general public" needs rather than "Uber needs", even if Uber > >>> will > >>>> be > >>>>>>>>>> implementing it - so from my point of view having Trigger > >>>> separation > >>>>>> as > >>>>>>>>>> part of it is quite important. > >>>>>>>>>> > >>>>>>>>>> But that's not even this. > >>>>>>>>>> > >>>>>>>>>> We've been discussing for example for Deadlines (being > >>> implemented > >>>>> by > >>>>>>>>>> Dennis and Ramit) a possibility of short, notification-style > >>>>>> "deadlines" > >>>>>>>>>> to be sent to the triggerer for execution - this is well advanced > >>> now, > >>>>> and > >>>>>>>>>> whether you want it or not Dag-provided code might be > >> serialized > >>>> and > >>>>>> sent > >>>>>>>>>> to the triggerer for execution. This is part of our "broader" > >>>>>> architectural > >>>>>>>>>> change where we treat "workers" and "triggerer" similarly - as general > >>>>> executors > >>>>>>>>>> of "sync" and "async" tasks respectively. That's > >> where > >>>>>> Airflow > >>>>>>>>>> is > >>>>>>>>>> evolving towards - inevitably. > >>>>>>>>>> > >>>>>>>>>> But we can of course phase things in/out for implementation - > >>> even > >>>>> if > >>>>>> the AIP > >>>>>>>>>> should cover both, I think if the goal of the AIP and preamble > >>> is > >>>>>> about > >>>>>>>>>> separating "user code" from "database" as the main reason, it > >>> also > >>>>>> means > >>>>>>>>>> the Triggerer if you ask me (from a design point of view at least). 
> >> >>>>>>>>>> > >> >>>>>>>>>> Again implementation can be phased and even different people > >> >> and > >> >>>>> teams > >> >>>>>>>>>> might work on those phases/pieces. > >> >>>>>>>>>> > >> >>>>>>>>>> J. > >> >>>>>>>>>> > >> >>>>>>>>>> On Fri, Jul 25, 2025 at 2:29 PM Sumit Maheshwari < > >> >>>>>> sumeet.ma...@gmail.com > >> >>>>>>>>>>> > >> >>>>>>>>>> wrote: > >> >>>>>>>>>> > >> >>>>>>>>>>>> > >> >>>>>>>>>>>>> #2. Yeah, we would need something similar for triggerers > as > >> >>>> well, > >> >>>>>>>>>> but > >> >>>>>>>>>>>> that > >> >>>>>>>>>>>> can be done as part of a different AIP > >> >>>>>>>>>>> > >> >>>>>>>>>>> > >> >>>>>>>>>>> You won't achieve your goal of "true" isolation of user code > >> >> if > >> >>>> you > >> >>>>>>>>>> don't > >> >>>>>>>>>>>> do triggerer. I think if the goal is to achieve it - it > >> >> should > >> >>>>> cover > >> >>>>>>>>>>> both. > >> >>>>>>>>>>> > >> >>>>>>>>>>> > >> >>>>>>>>>>> My bad, should've explained our architecture for triggers as > >> >>>> well, > >> >>>>>>>>>>> apologies. 
So here it is: > >>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> - Triggers would be running on a centralized service, so > >> all > >>>> the > >>>>>>>>>> Trigger > >>>>>>>>>>> classes will be part of the platform team's repo and not > >> the > >>>>>>>>>> customer's > >>>>>>>>>>> repo > >>>>>>>>>>> - The triggers won't be able to use any libs other than std > >>>> ones, > >>>>>>>>>> which > >>>>>>>>>>> are being used in core Airflow (like requests, etc) > >>>>>>>>>>> - As we are the owners of the core Airflow repo, customers > >>> have > >>>>> to > >>>>>>>>>> get > >>>>>>>>>>> our approval to land any class in this path (unlike the > >> dags > >>>> repo > >>>>>>>>>> which > >>>>>>>>>>> they own) > >>>>>>>>>>> - When a customer's task defers, we would have an allowlist > >> on > >>>> our > >>>>>>>>>> side > >>>>>>>>>>> to check if we should do the async polling or not > >>>>>>>>>>> - If the Trigger class isn't part of our repo (allowlist), > >>> just > >>>>>>>>>> fail the > >>>>>>>>>>> task, as anyway we won't be having the code that they used > >> in > >>>> the > >>>>>>>>>>> trigger > >>>>>>>>>>> class > >>>>>>>>>>> - If any of these conditions aren't suitable for you (as a > >>>>>>>>>> customer), > >>>>>>>>>>> feel free to use sync tasks only > >>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> But in general, I agree to make the triggerer svc also > >> communicate > >>>> over > >>>>>>>>>> APIs > >>>>>>>>>>> only. If that is done, then we can have instances of the > >> triggerer > >>>> svc > >>>>>>>>>> running > >>>>>>>>>>> on the customer's side as well, which can process any type of > >>> trigger > >>>>>>>>>> class. > >>>>>>>>>>> Though that's not a blocker for us at the moment, cause > >>> triggerers > >>>>> are > >>>>>>>>>>> mostly doing just polling using simple libs like requests. 
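The allowlist check described in the bullets above is essentially a classpath gate at defer time. This is a hypothetical sketch of that decision logic only; the function name, the return values, and the example trigger classpaths are illustrative, not part of Airflow or the platform described in the thread:

```python
# Hypothetical sketch of the defer-time allowlist: a trigger is handed to the
# centrally run triggerer only if its classpath was pre-approved (i.e. lives
# in the platform team's repo); otherwise the task is failed outright.
ALLOWED_TRIGGERS = {
    "airflow.triggers.temporal.DateTimeTrigger",
    "airflow.providers.http.triggers.http.HttpTrigger",
}


def on_task_defer(trigger_classpath: str) -> str:
    """Decide what happens when a customer's task defers."""
    if trigger_classpath in ALLOWED_TRIGGERS:
        return "poll"  # code exists centrally, async polling can proceed
    return "fail"      # trigger code isn't in the platform repo: fail the task


assert on_task_defer("airflow.triggers.temporal.DateTimeTrigger") == "poll"
assert on_task_defer("my_company.custom.SecretTrigger") == "fail"
```

Making the triggerer communicate over APIs only, as agreed later in the paragraph, would let this gate disappear: a customer-side triggerer instance could then run arbitrary trigger classes.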
> >> >>>>>>>>>>> > >> >>>>>>>>>>> > >> >>>>>>>>>>> > >> >>>>>>>>>>> On Fri, Jul 25, 2025 at 5:03 PM Igor Kholopov > >> >>>>>>>>>> <ikholo...@google.com.invalid > >> >>>>>>>>>>>> > >> >>>>>>>>>>> wrote: > >> >>>>>>>>>>> > >> >>>>>>>>>>>> Thanks Sumit for the detailed proposal. Overall I believe > it > >> >>>>> aligns > >> >>>>>>>>>> well > >> >>>>>>>>>>>> with the goals of making Airflow well-scalable beyond a > >> >>>>> single-team > >> >>>>>>>>>>>> deployment (and AIP-85 goals), so you have my full support > >> >>> with > >> >>>>> this > >> >>>>>>>>>> one. > >> >>>>>>>>>>>> > >> >>>>>>>>>>>> I've left a couple of clarification requests on the AIP > >> >> page. > >> >>>>>>>>>>>> > >> >>>>>>>>>>>> Thanks, > >> >>>>>>>>>>>> Igor > >> >>>>>>>>>>>> > >> >>>>>>>>>>>> On Fri, Jul 25, 2025 at 11:50 AM Sumit Maheshwari < > >> >>>>>>>>>>> sumeet.ma...@gmail.com> > >> >>>>>>>>>>>> wrote: > >> >>>>>>>>>>>> > >> >>>>>>>>>>>>> Thanks Jarek and Ash, for the initial review. It's good to > >> >>> know > >> >>>>>>>>>> that > >> >>>>>>>>>>> the > >> >>>>>>>>>>>>> DAG processor has some preemptive measures in place to > >> >>> prevent > >> >>>>>>>>>> access > >> >>>>>>>>>>>>> to the DB. However, the main issue we are trying to solve > >> >> is > >> >>>> not > >> >>>>> to > >> >>>>>>>>>>>> provide > >> >>>>>>>>>>>>> DB creds to the customer teams, who are using Airflow as a > >> >>>>>>>>>> multi-tenant > >> >>>>>>>>>>>>> orchestration platform. I've updated the doc to reflect > >> >> this > >> >>>>> point > >> >>>>>>>>>> as > >> >>>>>>>>>>>> well. > >> >>>>>>>>>>>>> > >> >>>>>>>>>>>>> Answering Jarek's points, > >> >>>>>>>>>>>>> > >> >>>>>>>>>>>>> #1. Yeah, had forgot to write about token mechanism, added > >> >>> that > >> >>>>> in > >> >>>>>>>>>> doc, > >> >>>>>>>>>>>> but > >> >>>>>>>>>>>>> still how the token can be obtained (safely) is still open > >> >> in > >> >>>> my > >> >>>>>>>>>> mind. 
> >>>>>>>>>>> I > >>>>>>>>>>>>> believe the token used by task executors can be created > >>> outside > >>>>> of > >>>>>>>>>> it > >>>>>>>>>>> as > >>>>>>>>>>>>> well (I may be wrong here). > >>>>>>>>>>>>> > >>>>>>>>>>>>> #2. Yeah, we would need something similar for triggerers as > >>>> well, > >>>>>>>>>> but > >>>>>>>>>>>> that > >>>>>>>>>>>>> can be done as part of a different AIP > >>>>>>>>>>>>> > >>>>>>>>>>>>> #3. Yeah, I also believe the API should largely work. > >>>>>>>>>>>>> > >>>>>>>>>>>>> #4. Added in the AIP that instead of dag_dirs we can > >>> work > >>>>>>>>>> with > >>>>>>>>>>>>> dag_bundles, and every dag-processor instance would be > >> treated > >>>> as > >>>>> a > >>>>>>>>>> different > >>>>>>>>>>>>> bundle. > >>>>>>>>>>>>> > >>>>>>>>>>>>> Also, added points around callbacks, as these are also > >>> fetched > >>>>>>>>>> directly > >>>>>>>>>>>>> from the DB. > >>>>>>>>>>>>> > >>>>>>>>>>>>> On Fri, Jul 25, 2025 at 11:58 AM Jarek Potiuk < > >>>> ja...@potiuk.com> > >>>>>>>>>>> wrote: > >>>>>>>>>>>>> > >>>>>>>>>>>>>>> A clarification to this - the dag parser today is likely > >>> not > >>>>>>>>>>>> protection > >>>>>>>>>>>>>> against a dedicated malicious DAG author, but it does > >>> protect > >>>>>>>>>> against > >>>>>>>>>>>>>> casual DB access attempts - the db session is blanked out > >> in > >>>> the > >>>>>>>>>>>> parsing > >>>>>>>>>>>>>> process, as are the env var configs > >>>>>>>>>>>>>> > >>>>>>>>>>>>> > >>>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>> > >>>>>> > >>>>> > >>>> > >>> > >> > https://github.com/apache/airflow/blob/main/task-sdk/src/airflow/sdk/execution_time/supervisor.py#L274-L316 > >>>>>>>>>>>>>> - > >>>>>>>>>>>>>> is this perfect? no, but it’s much more than no protection > >>>>>>>>>>>>>> Oh absolutely.. 
This is exactly what we discussed back > >> >> then > >> >>> in > >> >>>>>>>>>> March > >> >>>>>>>>>>> I > >> >>>>>>>>>>>>>> think - and the way we decided to go for 3.0 with full > >> >>>> knowledge > >> >>>>>>>>>> it's > >> >>>>>>>>>>>> not > >> >>>>>>>>>>>>>> protecting against all threats. > >> >>>>>>>>>>>>>> > >> >>>>>>>>>>>>>> On Fri, Jul 25, 2025 at 8:22 AM Ash Berlin-Taylor < > >> >>>>>>>>>> a...@apache.org> > >> >>>>>>>>>>>>> wrote: > >> >>>>>>>>>>>>>> > >> >>>>>>>>>>>>>>> A clarification to this - the dag parser today is likely > >> >>> not > >> >>>>>>>>>>>> protection > >> >>>>>>>>>>>>>>> against a dedicated malicious DAG author, but it does > >> >>> protect > >> >>>>>>>>>>> against > >> >>>>>>>>>>>>>>> casual DB access attempts - the db session is blanked > out > >> >>> in > >> >>>>>>>>>> the > >> >>>>>>>>>>>>> parsing > >> >>>>>>>>>>>>>>> process , as are the env var configs > >> >>>>>>>>>>>>>>> > >> >>>>>>>>>>>>>> > >> >>>>>>>>>>>>> > >> >>>>>>>>>>>> > >> >>>>>>>>>>> > >> >>>>>>>>>> > >> >>>>>> > >> >>>>> > >> >>>> > >> >>> > >> >> > >> > https://github.com/apache/airflow/blob/main/task-sdk/src/airflow/sdk/execution_time/supervisor.py#L274-L316 > >> >>>>>>>>>>>>>>> - is this perfect no? but it’s much more than no > >> >> protection > >> >>>>>>>>>>>>>>> > >> >>>>>>>>>>>>>>>> On 24 Jul 2025, at 21:56, Jarek Potiuk < > >> >> ja...@potiuk.com> > >> >>>>>>>>>> wrote: > >> >>>>>>>>>>>>>>>> > >> >>>>>>>>>>>>>>>> Currently in the DagFile processor there is no > built-in > >> >>>>>>>>>>> protection > >> >>>>>>>>>>>>>>> against > >> >>>>>>>>>>>>>>>> user code from Dag Parsing to - for example - read > >> >>> database > >> >>>>>>>>>>>>>>>> credentials from airflow configuration and use them to > >> >>> talk > >> >>>>>>>>>> to DB > >> >>>>>>>>>>>>>>> directly. > >> >>>>>>>>>>>>>>> > >> >>>>>>>>>>>>>> > >> >>>>>>>>>>>>> > >> >>>>>>>>>>>> > >> >>>>>>>>>>> > >> >>>>>>>>>> > >> >>>>>>>>> > >> >>>>>> > >> >>>>>> > >> >>>>> > >> >>>> > >> >>> > >> >> > >> > >> >