My main concern with this right now is the serialisation format of DAGs — it wasn’t really designed with remote submission in mind, so it need some careful examination to see if it is fit for this purpose or not.
In particular One of the things I worry about is that the JSON can get huge — I’ve seem this as large as 10-20Mb for some dags(!!) (which is likely due to things being included as text when a bool might suffice, for example) But I don’t think “just submit the existing JSON over an API” is a good idea. I also wonder if as part of this proposal we should move the Callback requests off the dag parsers and on to the workers instead — in AIP-72 we introduced the concept of a Workload, with the only one existing right now is “ExecuteTask” https://github.com/apache/airflow/blob/8e1201c7713d5c677fa6f6d48bbd4f6903505f61/airflow-core/src/airflow/executors/workloads.py#L87-L88 — it might be time to finally move task and dag callbacks to the same thing and make dag parsers only responsible for, well, parsing. :) These are all solvable problems, and this will be a great feature to have, but we need to do some more thinking and planning first. -ash > On 31 Jul 2025, at 10:12, Sumit Maheshwari <sumeet.ma...@gmail.com> wrote: > > Gentle reminder for everyone to review the proposal. > > Updated link: > https://cwiki.apache.org/confluence/display/AIRFLOW/%5BWIP%5D+AIP-92+Isolate+DAG+processor%2C+Callback+processor%2C+and+Triggerer+from+core+services > > On Tue, Jul 29, 2025 at 4:37 PM Sumit Maheshwari <sumeet.ma...@gmail.com> > wrote: > >> Thanks everyone for reviewing this AIP. As Jarek and others suggested, I >> expanded the scope of this AIP and divided it into three phases. With the >> increased scope, the boundary line between this AIP and AIP-85 got a little >> thinner, but I believe these are still two different enhancements to make. >> >> >> >> On Fri, Jul 25, 2025 at 10:51 PM Sumit Maheshwari <sumeet.ma...@gmail.com> >> wrote: >> >>> Yeah, overall it makes sense to include Triggers as well to be part of >>> this AIP and phase out the implementation. Though I didn't exclude Triggers >>> because "Uber" doesn't need that, I just thought of keeping the scope of >>> development small and achieving them, just like it was done in Airlfow 3 by >>> secluding only workers and not DAG-processor & Triggers. >>> >>> But if you think Triggers should be part of this AIP itself, then I can >>> do that and include Triggers as well in it. >>> >>> On Fri, Jul 25, 2025 at 7:34 PM Jarek Potiuk <ja...@potiuk.com> wrote: >>> >>>> I would very much prefer the architectural choices of this AIP are based >>>> on >>>> "general public" needs rather than "Uber needs" even if Uber will be >>>> implementing it - so from my point of view having Trigger separation as >>>> part of it is quite important. >>>> >>>> But that's not even this. >>>> >>>> We've been discussing for example for Deadlines (being implemented by >>>> Dennis and Ramit a possibility of short, notification-style "deadlines" >>>> to be send to triggerer for execution - this is well advanced now, and >>>> whether you want it or not Dag-provided code might be serialized and sent >>>> to triggerer for execution. This is part of our "broader" architectural >>>> change where we treat "workers" and "triggerer" similarly as a general >>>> executors of "sync" and "async" tasks respectively. That's where Airflow >>>> is >>>> evolving towards - inevitably. >>>> >>>> But we can of course phase things in out for implementation - even if AIP >>>> should cover both, I think if the goal of the AIP and preamble is about >>>> separating "user code" from "database" as the main reason, it also means >>>> Triggerer if you ask me (from design point of view at least). >>>> >>>> Again implementation can be phased and even different people and teams >>>> might work on those phases/pieces. >>>> >>>> J. >>>> >>>> On Fri, Jul 25, 2025 at 2:29 PM Sumit Maheshwari <sumeet.ma...@gmail.com >>>>> >>>> wrote: >>>> >>>>>> >>>>>>> #2. Yeah, we would need something similar for triggerers as well, >>>> but >>>>>> that >>>>>> can be done as part of a different AIP >>>>> >>>>> >>>>> You won't achieve your goal of "true" isolation of user code if you >>>> don't >>>>>> do triggerer. I think if the goal is to achieve it - it should cover >>>>> both. >>>>> >>>>> >>>>> My bad, should've explained our architecture for triggers as well, >>>>> apologies. So here it is: >>>>> >>>>> >>>>> - Triggers would be running on a centralized service, so all the >>>> Trigger >>>>> classes will be part of the platform team's repo and not the >>>> customer's >>>>> repo >>>>> - The triggers won't be able to use any libs other than std ones, >>>> which >>>>> are being used in core Airflow (like requests, etc) >>>>> - As we are the owners of the core Airflow repo, customers have to >>>> get >>>>> our approval to land any class in this path (unlike the dags repo >>>> which >>>>> they own) >>>>> - When a customer's task defer, we would have an allowlist on our >>>> side >>>>> to check if we should do the async polling or not >>>>> - If the Trigger class isn't part of our repo (allowlist), just >>>> fail the >>>>> task, as anyway we won't be having the code that they used in the >>>>> trigger >>>>> class >>>>> - If any of these conditions aren't suitable for you (as a >>>> customer), >>>>> feel free to use sync tasks only >>>>> >>>>> >>>>> But in general, I agree to make triggerer svc also communicate over >>>> apis >>>>> only. If that is done, then we can have instances of triggerer svc >>>> running >>>>> at customer's side as well, which can process any type of trigger >>>> class. >>>>> Though that's not a blocker for us at the moment, cause triggerer are >>>>> mostly doing just polling using simple libs like requests. >>>>> >>>>> >>>>> >>>>> On Fri, Jul 25, 2025 at 5:03 PM Igor Kholopov >>>> <ikholo...@google.com.invalid >>>>>> >>>>> wrote: >>>>> >>>>>> Thanks Sumit for the detailed proposal. Overall I believe it aligns >>>> well >>>>>> with the goals of making Airflow well-scalable beyond a single-team >>>>>> deployment (and AIP-85 goals), so you have my full support with this >>>> one. >>>>>> >>>>>> I've left a couple of clarification requests on the AIP page. >>>>>> >>>>>> Thanks, >>>>>> Igor >>>>>> >>>>>> On Fri, Jul 25, 2025 at 11:50 AM Sumit Maheshwari < >>>>> sumeet.ma...@gmail.com> >>>>>> wrote: >>>>>> >>>>>>> Thanks Jarek and Ash, for the initial review. It's good to know >>>> that >>>>> the >>>>>>> DAG processor has some preemptive measures in place to prevent >>>> access >>>>>>> to the DB. However, the main issue we are trying to solve is not to >>>>>> provide >>>>>>> DB creds to the customer teams, who are using Airflow as a >>>> multi-tenant >>>>>>> orchestration platform. I've updated the doc to reflect this point >>>> as >>>>>> well. >>>>>>> >>>>>>> Answering Jarek's points, >>>>>>> >>>>>>> #1. Yeah, had forgot to write about token mechanism, added that in >>>> doc, >>>>>> but >>>>>>> still how the token can be obtained (safely) is still open in my >>>> mind. >>>>> I >>>>>>> believe the token used by task executors can be created outside of >>>> it >>>>> as >>>>>>> well (I may be wrong here). >>>>>>> >>>>>>> #2. Yeah, we would need something similar for triggerers as well, >>>> but >>>>>> that >>>>>>> can be done as part of a different AIP >>>>>>> >>>>>>> #3. Yeah, I also believe the API should work largely. >>>>>>> >>>>>>> #4. Added that in the AIP, that instead of dag_dirs we can work >>>> with >>>>>>> dag_bundles and every dag-processor instance would be treated as a >>>> diff >>>>>>> bundle. >>>>>>> >>>>>>> Also, added points around callbacks, as these are also fetched >>>> directly >>>>>>> from the DB. >>>>>>> >>>>>>> On Fri, Jul 25, 2025 at 11:58 AM Jarek Potiuk <ja...@potiuk.com> >>>>> wrote: >>>>>>> >>>>>>>>> A clarification to this - the dag parser today is likely not >>>>>> protection >>>>>>>> against a dedicated malicious DAG author, but it does protect >>>> against >>>>>>>> casual DB access attempts - the db session is blanked out in the >>>>>> parsing >>>>>>>> process , as are the env var configs >>>>>>>> >>>>>>>> >>>>>>> >>>>>> >>>>> >>>> https://github.com/apache/airflow/blob/main/task-sdk/src/airflow/sdk/execution_time/supervisor.py#L274-L316 >>>>>>>> - >>>>>>>> is this perfect no? but it’s much more than no protection >>>>>>>> Oh absolutely.. This is exactly what we discussed back then in >>>> March >>>>> I >>>>>>>> think - and the way we decided to go for 3.0 with full knowledge >>>> it's >>>>>> not >>>>>>>> protecting against all threats. >>>>>>>> >>>>>>>> On Fri, Jul 25, 2025 at 8:22 AM Ash Berlin-Taylor < >>>> a...@apache.org> >>>>>>> wrote: >>>>>>>> >>>>>>>>> A clarification to this - the dag parser today is likely not >>>>>> protection >>>>>>>>> against a dedicated malicious DAG author, but it does protect >>>>> against >>>>>>>>> casual DB access attempts - the db session is blanked out in >>>> the >>>>>>> parsing >>>>>>>>> process , as are the env var configs >>>>>>>>> >>>>>>>> >>>>>>> >>>>>> >>>>> >>>> https://github.com/apache/airflow/blob/main/task-sdk/src/airflow/sdk/execution_time/supervisor.py#L274-L316 >>>>>>>>> - is this perfect no? but it’s much more than no protection >>>>>>>>> >>>>>>>>>> On 24 Jul 2025, at 21:56, Jarek Potiuk <ja...@potiuk.com> >>>> wrote: >>>>>>>>>> >>>>>>>>>> Currently in the DagFile processor there is no built-in >>>>> protection >>>>>>>>> against >>>>>>>>>> user code from Dag Parsing to - for example - read database >>>>>>>>>> credentials from airflow configuration and use them to talk >>>> to DB >>>>>>>>> directly. >>>>>>>>> >>>>>>>> >>>>>>> >>>>>> >>>>> >>>> >>>