+1 on this as well. From what I have seen, standalone DAG processing results in a minor performance advantage and, importantly, makes the Scheduler loop more resilient to DAG processor crashes.
Shubham

On 2025-01-09, 4:02 PM, "Daniel Imberman" <daniel.imber...@gmail.com> wrote:

I'm +1 on this. The fact that there's one more thing to deploy isn't that big of an issue given the number of pre-configurable options mentioned (e.g. helm) and a full logical separation of DAG parsing and scheduling makes sense (one thing that has been a longstanding issue with Airflow is the scheduler "Doing too many things", so it would be nice to create a clean divide here).

On Thu, Jan 9, 2025 at 3:28 PM Jed Cunningham <j...@astronomer.io.invalid> wrote:

> Hello everyone!
>
> As I've been working on parsing lately, I want to propose a change in that
> area in time for Airflow 3.
>
> Today there are 2 different ways the DAG processor can be run in Airflow -
> as a standalone component, or embedded in the scheduler. The standalone
> option came in 2.3, prior to that the only option was it being embedded in
> the scheduler.
>
> Why standalone? Generally speaking, parsing scales vertically (single loop
> - more concurrent parsing) while scheduling is scaled horizontally (many
> loops). As the DAG processor and scheduler scale in different manners, it's
> awkward to have them live in the same component. There is also a resiliency
> aspect here, no noisy neighbor issues.
>
> Really, the only positive of the embedded option is that it's easier to
> deploy, as there is 1 less component to worry about. However, we already
> have a number of components, so 1 more isn't that cumbersome. Everyone
> using breeze, standalone, the helm chart, a vendor, won't be impacted much
> by this change - in fact, having the log stream separate is a big positive!
>
> We'd also be able to remove a bit of complexity around reinitialising a
> bunch of stuff in the child process.
>
> Overall, I see primarily positives with this change, and a major version
> upgrade is the perfect time to simplify this part of Airflow. Thoughts?
>
> Jed