Re: [DISCUSS] AIP-92 Isolate DAG parsing logic

Ash Berlin-Taylor Thu, 31 Jul 2025 03:28:18 -0700

My main concern with this right now is the serialisation format of DAGs: it 
wasn’t really designed with remote submission in mind, so it needs some careful 
examination to see if it is fit for this purpose or not.


In particular, one of the things I worry about is that the JSON can get huge: 
I’ve seen this as large as 10-20MB for some dags(!!) (which is likely due to 
things being included as text when a bool might suffice, for example). I 
don’t think “just submit the existing JSON over an API” is a good idea.
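
To make the size concern concrete, here is a rough sketch of how one might measure a serialized payload before deciding whether it is safe to send over an API. The task fields are invented for illustration and are not the real Airflow serialized DAG schema; `payload_size` is a hypothetical helper, and gzip stands in for whatever compression (if any) a real design would pick.

```python
import gzip
import json


def make_tasks(count, as_text):
    """Build hypothetical task entries; flags are stored as text or as real bools."""
    flag = "False" if as_text else False
    return [
        {"task_id": f"task_{i}", "depends_on_past": flag, "wait_for_downstream": flag}
        for i in range(count)
    ]


def payload_size(tasks):
    """Return (raw_bytes, gzipped_bytes) for a compact JSON encoding of the tasks."""
    raw = json.dumps({"tasks": tasks}, separators=(",", ":")).encode("utf-8")
    return len(raw), len(gzip.compress(raw))


# Same tasks, encoded with text flags versus bool flags.
verbose_raw, verbose_gz = payload_size(make_tasks(10_000, as_text=True))
compact_raw, compact_gz = payload_size(make_tasks(10_000, as_text=False))

print("text flags:", verbose_raw, "bytes raw,", verbose_gz, "bytes gzipped")
print("bool flags:", compact_raw, "bytes raw,", compact_gz, "bytes gzipped")
```

Numbers from a toy like this say nothing about real dags; the point is only that the format should be measured against realistic large dags before it becomes an API contract.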

I also wonder if as part of this proposal we should move the Callback requests 
off the dag parsers and on to the workers instead. In AIP-72 we introduced the 
concept of a Workload, and the only one existing right now is “ExecuteTask” 
https://github.com/apache/airflow/blob/8e1201c7713d5c677fa6f6d48bbd4f6903505f61/airflow-core/src/airflow/executors/workloads.py#L87-L88
so it might be time to finally move task and dag callbacks to the same thing 
and make dag parsers only responsible for, well, parsing. :) 
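
As a very loose illustration of what "callbacks as a Workload" could look like, here is a sketch using plain dataclasses. This `ExecuteTask` is a simplified stand-in and not the real class in workloads.py; `RunCallback`, its fields, and `dispatch` are hypothetical names made up for this example.

```python
from dataclasses import dataclass
from typing import Literal, Union


@dataclass(frozen=True)
class ExecuteTask:
    """Simplified stand-in for the existing workload type (not the real class)."""
    task_instance_id: str
    type: Literal["ExecuteTask"] = "ExecuteTask"


@dataclass(frozen=True)
class RunCallback:
    """Hypothetical second workload type for task and dag callbacks."""
    dag_id: str
    callback_name: str
    type: Literal["RunCallback"] = "RunCallback"


# A worker would accept any member of this union.
Workload = Union[ExecuteTask, RunCallback]


def dispatch(workload: Workload) -> str:
    """Route a workload to its handler; here it only describes what would run."""
    if isinstance(workload, ExecuteTask):
        return f"run task {workload.task_instance_id}"
    if isinstance(workload, RunCallback):
        return f"run callback {workload.callback_name} for dag {workload.dag_id}"
    raise TypeError(f"unknown workload: {workload!r}")


print(dispatch(ExecuteTask(task_instance_id="ti-1")))
print(dispatch(RunCallback(dag_id="example_dag", callback_name="on_failure")))
```

With something along these lines the dag parser never has to run callback code at all; it only produces the parsed DAG.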

These are all solvable problems, and this will be a great feature to have, but 
we need to do some more thinking and planning first.

-ash

> On 31 Jul 2025, at 10:12, Sumit Maheshwari <sumeet.ma...@gmail.com> wrote:
> 
> Gentle reminder for everyone to review the proposal.
> 
> Updated link:
> https://cwiki.apache.org/confluence/display/AIRFLOW/%5BWIP%5D+AIP-92+Isolate+DAG+processor%2C+Callback+processor%2C+and+Triggerer+from+core+services
> 
> On Tue, Jul 29, 2025 at 4:37 PM Sumit Maheshwari <sumeet.ma...@gmail.com>
> wrote:
> 
>> Thanks everyone for reviewing this AIP. As Jarek and others suggested, I
>> expanded the scope of this AIP and divided it into three phases. With the
>> increased scope, the boundary line between this AIP and AIP-85 got a little
>> thinner, but I believe these are still two different enhancements to make.
>> 
>> 
>> 
>> On Fri, Jul 25, 2025 at 10:51 PM Sumit Maheshwari <sumeet.ma...@gmail.com>
>> wrote:
>> 
>>> Yeah, overall it makes sense to include Triggers as part of this AIP and
>>> phase the implementation. I didn't exclude Triggers because "Uber" doesn't
>>> need them; I just thought of keeping the scope of development small and
>>> achievable, just like it was done in Airflow 3 by isolating only workers
>>> and not the DAG-processor & Triggers.
>>> 
>>> But if you think Triggers should be part of this AIP itself, then I can
>>> do that and include Triggers as well in it.
>>> 
>>> On Fri, Jul 25, 2025 at 7:34 PM Jarek Potiuk <ja...@potiuk.com> wrote:
>>> 
>>>> I would very much prefer the architectural choices of this AIP are based
>>>> on "general public" needs rather than "Uber needs" even if Uber will be
>>>> implementing it - so from my point of view having Trigger separation as
>>>> part of it is quite important.
>>>> 
>>>> But that's not even this.
>>>> 
>>>> We've been discussing for example for Deadlines (being implemented by
>>>> Dennis and Ramit) a possibility of short, notification-style "deadlines"
>>>> to be sent to triggerer for execution - this is well advanced now, and
>>>> whether you want it or not Dag-provided code might be serialized and sent
>>>> to triggerer for execution. This is part of our "broader" architectural
>>>> change where we treat "workers" and "triggerer" similarly as general
>>>> executors of "sync" and "async" tasks respectively. That's where Airflow
>>>> is evolving towards - inevitably.
>>>> 
>>>> But we can of course phase things in for implementation - even if the AIP
>>>> should cover both. I think if the goal of the AIP and preamble is about
>>>> separating "user code" from "database" as the main reason, it also means
>>>> Triggerer if you ask me (from design point of view at least).
>>>> 
>>>> Again implementation can be phased and even different people and teams
>>>> might work on those phases/pieces.
>>>> 
>>>> J.
>>>> 
>>>> On Fri, Jul 25, 2025 at 2:29 PM Sumit Maheshwari <sumeet.ma...@gmail.com
>>>>> 
>>>> wrote:
>>>> 
>>>>>> 
>>>>>>> #2. Yeah, we would need something similar for triggerers as well,
>>>> but
>>>>>> that
>>>>>> can be done as part of a different AIP
>>>>> 
>>>>> 
>>>>> You won't achieve your goal of "true" isolation of user code if you
>>>> don't
>>>>>> do triggerer. I think if the goal is to achieve it - it should cover
>>>>> both.
>>>>> 
>>>>> 
>>>>> My bad, should've explained our architecture for triggers as well,
>>>>> apologies. So here it is:
>>>>> 
>>>>> 
>>>>>   - Triggers would be running on a centralized service, so all the
>>>> Trigger
>>>>>   classes will be part of the platform team's repo and not the
>>>> customer's
>>>>> repo
>>>>>   - The triggers won't be able to use any libs other than std ones,
>>>> which
>>>>>   are being used in core Airflow (like requests, etc)
>>>>>   - As we are the owners of the core Airflow repo, customers have to
>>>> get
>>>>>   our approval to land any class in this path (unlike the dags repo
>>>> which
>>>>>   they own)
>>>>>   - When a customer's task defer, we would have an allowlist on our
>>>> side
>>>>>   to check if we should do the async polling or not
>>>>>   - If the Trigger class isn't part of our repo (allowlist), just
>>>> fail the
>>>>>   task, as anyway we won't be having the code that they used in the
>>>>> trigger
>>>>>   class
>>>>>   - If any of these conditions aren't suitable for you (as a
>>>> customer),
>>>>>   feel free to use sync tasks only
>>>>> 
>>>>> 
>>>>> But in general, I agree to make triggerer svc also communicate over
>>>> apis
>>>>> only. If that is done, then we can have instances of triggerer svc
>>>> running
>>>>> at customer's side as well, which can process any type of trigger
>>>> class.
>>>>> Though that's not a blocker for us at the moment, cause triggerer are
>>>>> mostly doing just polling using simple libs like requests.
>>>>> 
>>>>> 
>>>>> 
>>>>> On Fri, Jul 25, 2025 at 5:03 PM Igor Kholopov
>>>> <ikholo...@google.com.invalid
>>>>>> 
>>>>> wrote:
>>>>> 
>>>>>> Thanks Sumit for the detailed proposal. Overall I believe it aligns
>>>> well
>>>>>> with the goals of making Airflow well-scalable beyond a single-team
>>>>>> deployment (and AIP-85 goals), so you have my full support with this
>>>> one.
>>>>>> 
>>>>>> I've left a couple of clarification requests on the AIP page.
>>>>>> 
>>>>>> Thanks,
>>>>>> Igor
>>>>>> 
>>>>>> On Fri, Jul 25, 2025 at 11:50 AM Sumit Maheshwari <
>>>>> sumeet.ma...@gmail.com>
>>>>>> wrote:
>>>>>> 
>>>>>>> Thanks Jarek and Ash, for the initial review. It's good to know
>>>> that
>>>>> the
>>>>>>> DAG processor has some preemptive measures in place to prevent
>>>> access
>>>>>>> to the DB. However, the main issue we are trying to solve is not to
>>>>>> provide
>>>>>>> DB creds to the customer teams, who are using Airflow as a
>>>> multi-tenant
>>>>>>> orchestration platform. I've updated the doc to reflect this point
>>>> as
>>>>>> well.
>>>>>>> 
>>>>>>> Answering Jarek's points,
>>>>>>> 
>>>>>>> #1. Yeah, I had forgotten to write about the token mechanism; I've added
>>>>>>> that to the doc, but how the token can be obtained (safely) is still open
>>>>>>> in my mind. I believe the token used by task executors can be created
>>>>>>> outside of it as well (I may be wrong here).
>>>>>>> 
>>>>>>> #2. Yeah, we would need something similar for triggerers as well,
>>>> but
>>>>>> that
>>>>>>> can be done as part of a different AIP
>>>>>>> 
>>>>>>> #3. Yeah, I also believe the API should work largely.
>>>>>>> 
>>>>>>> #4. Added that in the AIP, that instead of dag_dirs we can work
>>>> with
>>>>>>> dag_bundles and every dag-processor instance would be treated as a
>>>> diff
>>>>>>> bundle.
>>>>>>> 
>>>>>>> Also, added points around callbacks, as these are also fetched
>>>> directly
>>>>>>> from the DB.
>>>>>>> 
>>>>>>> On Fri, Jul 25, 2025 at 11:58 AM Jarek Potiuk <ja...@potiuk.com>
>>>>> wrote:
>>>>>>> 
>>>>>>>>> A clarification to this - the dag parser today is likely not
>>>>>> protection
>>>>>>>> against a dedicated malicious DAG author, but it does protect
>>>> against
>>>>>>>> casual DB access attempts - the db session is blanked out in the
>>>>>> parsing
>>>>>>>> process , as are the env var configs
>>>>>>>> 
>>>>>>>> 
>>>>>>> 
>>>>>> 
>>>>> 
>>>> https://github.com/apache/airflow/blob/main/task-sdk/src/airflow/sdk/execution_time/supervisor.py#L274-L316
>>>>>>>> -
>>>>>>>> is this perfect no? but it’s much more than no protection
>>>>>>>> Oh absolutely.. This is exactly what we discussed back then in
>>>> March
>>>>> I
>>>>>>>> think - and the way we decided to go for 3.0 with full knowledge
>>>> it's
>>>>>> not
>>>>>>>> protecting against all threats.
>>>>>>>> 
>>>>>>>> On Fri, Jul 25, 2025 at 8:22 AM Ash Berlin-Taylor <
>>>> a...@apache.org>
>>>>>>> wrote:
>>>>>>>> 
>>>>>>>>> A clarification to this - the dag parser today is likely not
>>>>>> protection
>>>>>>>>> against a dedicated malicious DAG author, but it does protect
>>>>> against
>>>>>>>>> casual DB access attempts - the db session is blanked out in
>>>> the
>>>>>>> parsing
>>>>>>>>> process , as are the env var configs
>>>>>>>>> 
>>>>>>>> 
>>>>>>> 
>>>>>> 
>>>>> 
>>>> https://github.com/apache/airflow/blob/main/task-sdk/src/airflow/sdk/execution_time/supervisor.py#L274-L316
>>>>>>>>> - is this perfect no? but it’s much more than no protection
>>>>>>>>> 
>>>>>>>>>> On 24 Jul 2025, at 21:56, Jarek Potiuk <ja...@potiuk.com>
>>>> wrote:
>>>>>>>>>> 
>>>>>>>>>> Currently in the DagFile processor there is no  built-in
>>>>> protection
>>>>>>>>> against
>>>>>>>>>> user code from Dag Parsing to - for example - read database
>>>>>>>>>> credentials from airflow configuration and use them to talk
>>>> to DB
>>>>>>>>> directly.
>>>>>>>>> 
>>>>>>>> 
>>>>>>> 
>>>>>> 
>>>>> 
>>>> 
>>>
