In general - yes. That's something we discussed as a next step - and the "final" step - of true separation of "user code" from "airflow code".
Currently there is no built-in protection in the DagFile processor preventing user code executed during Dag parsing from - for example - reading database credentials from the Airflow configuration and using them to talk to the DB directly. I raised the same questions at the dev calls some time ago when we discussed task isolation, and from the discussion on one of those calls this is what we agreed we should have eventually. It was a conscious choice (for now) to only make sure that the DB is not actually used in the parsing process itself - parsed serialized dags are still passed through multiprocessing to a process that has DB access. The deliberate choice was that 3.0 would only isolate "workers" and nothing else.

This will also make the multi-team setup more isolated. And if you would like to take it on, that would be cool :)

I would love to first hear feedback on the general concept before diving into detailed comments, but my first few - very crude - comments are:

1) There should be a way to authenticate the Dag processor to the API server (long-lived auth information). Currently the Dag processor uses DB credentials to write to the DB (which is not isolated by definition), but when we open it up as an API we should not allow "anyone" to post new serialized dags, so there needs to be a mechanism to authenticate the "dag processor". I don't have a concrete proposal yet, but we need to figure it out.

2) I am afraid the Dag processor alone is not enough - we also need to make a very similar split in the Triggerer. The Triggerer is another component that has access to both user code and the database. This means a "/triggerer" namespace as well; that API should be used to update/retrieve serialized triggers in the Airflow DB.

3) I don't think performance is a "huge" issue. For POST, the DAG processor currently performs an UPDATE on the serialized dag table with the same payload. The worst case is that we need more api-servers to handle it, because instead of dag processor -> DB, the payload will be sent via the api-server.
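To make point 1 a bit more concrete - and this is purely an illustrative sketch, not a concrete proposal (names, token format and key handling are all made up) - one could imagine the deployment minting a long-lived, signed credential for each dag processor, which the api-server then verifies before accepting POSTs:

```python
import base64
import hashlib
import hmac
import json

# Illustrative only - a real implementation would use proper key management,
# not a hardcoded secret shared between the deployment and the api-server.
SECRET = b"deployment-shared-secret"


def mint_token(component: str, dag_bundles: list[str]) -> str:
    """Mint a long-lived credential binding a component to specific dag bundles."""
    payload = json.dumps({"sub": component, "bundles": sorted(dag_bundles)}).encode()
    sig = hmac.new(SECRET, payload, hashlib.sha256).digest()
    return (
        base64.urlsafe_b64encode(payload).decode()
        + "."
        + base64.urlsafe_b64encode(sig).decode()
    )


def verify_token(token: str) -> dict:
    """api-server side: reject anything not signed with the deployment secret."""
    payload_b64, sig_b64 = token.split(".")
    payload = base64.urlsafe_b64decode(payload_b64)
    expected = hmac.new(SECRET, payload, hashlib.sha256).digest()
    if not hmac.compare_digest(expected, base64.urlsafe_b64decode(sig_b64)):
        raise PermissionError("invalid dag-processor credential")
    return json.loads(payload)


token = mint_token("dag-processor-team-a", ["bundle-a", "bundle-b"])
claims = verify_token(token)
# claims["bundles"] tells the api-server which dag bundles this processor may touch
```

The nice property of baking the bundle list into the signed payload is that the api-server can authorize each request against the bundles claim without any DB lookup.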
It could even be a dedicated api-server if need be. Yep, it will likely require a bit more resources, but I think it's at most 2x the dag processor memory in the worst case, which is a "fixed" increase (and we could likely keep the option of **not** using the API server in simple installations - more resources would only be needed when "more isolation" is needed). I don't think we will have a problem with pagination of GET either - there is absolutely no need to retrieve the full serialized dag in the GET method. We only need metadata: version information and the hash of the serialized dag, but not the serialized dag itself. That should be rather small, especially given point 4).

4) The API should have a "dag_bundle" (or a set of dag_bundles) as the primary "selection" criterion. That would also play very well with multi-team, where you should have a separate dag processor for a group of dag_bundles belonging to one team. This also means that the authentication in point 1) should cover access to specific dag bundles only - for example, I imagine we could generate long-lived private credentials containing a signed list of the dag bundles that the dag processor should be allowed to interact with.

J.

On Thu, Jul 24, 2025 at 7:30 PM Sumit Maheshwari <sumeet.ma...@gmail.com> wrote:
> Hello everyone,
>
> I have created a draft version to separate the running of the DAG processor
> from Airflow's core services and moved it closer to the execution side.
>
> Please review the proposal and provide your valuable feedback.
>
> PS: We are in the process of adopting Airflow 3 for some in-house setup, so
> I might not have a full understanding of the latest Airflow concepts and
> some other nitty-gritties.
>
> Thanks,
> Sumit
>
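PS: to make point 3 a bit more concrete - a metadata-only GET payload could be as small as the sketch below (field names are purely illustrative, not a proposed schema):

```python
from dataclasses import asdict, dataclass


@dataclass
class SerializedDagMeta:
    """Illustrative metadata-only GET payload - no serialized dag body included."""

    dag_id: str
    dag_bundle: str     # primary selection criterion, per point 4
    dag_version: int
    dag_hash: str       # hash of the serialized dag, enough to detect staleness
    last_updated: str   # ISO-8601 timestamp


meta = SerializedDagMeta(
    dag_id="example_dag",
    dag_bundle="bundle-a",
    dag_version=7,
    dag_hash="sha256:ab12...",
    last_updated="2025-07-24T19:30:00Z",
)
# The dag processor compares dag_hash locally and only POSTs the full
# serialized payload when the dag actually changed.
print(asdict(meta))
```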