This is quite an interesting read, Jeongwoo. I don't yet have a strong opinion on it beyond that it's worth checking out. I'll reread it a couple of times and hopefully come up with some thoughts, but your investigation so far looks promising.
--
Regards,
Aritra Basu

On Thu, 18 Dec 2025, 2:08 pm Jeongwoo Do, <[email protected]> wrote:

> Hello Airflow community,
>
> While working on resolving a memory leak issue in the LocalExecutor [1], I
> observed that garbage collection (GC) in forked subprocesses was triggering
> Copy-On-Write (COW) on shared memory, which significantly increased each
> process's PSS. By using gc.freeze to move objects created at subprocess
> startup into the GC permanent generation, I was able to mitigate this issue
> effectively.
>
> I would like to propose applying the same approach to the Dag processor to
> address GC-related performance issues and improve stability in
> subprocesses. Below are the expected benefits.
>
> Preventing COW on Shared Memory
> Unlike the LocalExecutor, where subprocesses are long-lived, Dag processor
> subprocesses are not permanent. However, with the increasing adoption of
> dynamic Dags, parsing time has become longer in many cases. GC activity
> during parsing can trigger COW on shared memory, leading to memory spikes.
> In containerized environments, these spikes can result in OOM events.
>
> Improving GC Performance
> Applying gc.freeze marks existing objects as non-GC targets. As a result,
> this greatly lowers the frequency of threshold-based GC runs and makes GC
> much faster when it does occur. In a simple experiment, I observed GC time
> dropping from roughly 1 second to about 1 microsecond (with GC forced via
> gc.collect).
>
> Eliminating GC-Related Issues in Child Processes
> Similar to the issue in [2], GC triggered arbitrarily in child processes
> can affect shared objects inherited from the parent. By ensuring that
> parent-owned objects are not subject to GC in children, these issues can be
> avoided entirely.
>
> Beyond immediate performance and stability improvements, increased memory
> stability also enables further optimizations. For example, preloading heavy
> modules in the parent process can eliminate repeated memory loading in each
> child process. This approach has been discussed previously in [3], and
> preloading Airflow modules is already partially implemented today.
>
> While [3] primarily focused on parsing time, the broader benefit is reduced
> CPU and memory usage overall. Extending this idea beyond Airflow modules by
> allowing users to pre-import libraries used in Dag files could provide
> significant performance gains.
>
> That said, it is also clear why this has not been broadly adopted so far.
> Persistently importing problematic libraries defined in Dag files could
> introduce side effects, and unloading modules once loaded is difficult. In
> environments with frequent Dag changes, this can become a burden.
>
> For this reason, I believe the optimal approach is to allow pre-importing
> only for explicitly user-approved libraries. Users would define which
> libraries to preload via configuration. These libraries would be loaded
> lazily, and only after they are successfully loaded in a child process
> would they be loaded in the parent process as well. The pre-import
> mechanism I proposed recently in [4] may be helpful here.
>
> In summary, I am proposing two items:
>
> 1. Apply gc.freeze to the Dag processor.
> 2. Then, allow intentional, user-approved preloading of libraries.
>
> Thank you for taking the time to read this. If this proposal requires an
> AIP, I would be happy to prepare one.
>
> [1] https://github.com/apache/airflow/pull/58365
> [2] https://github.com/apache/airflow/issues/56879
> [3] https://github.com/apache/airflow/pull/30495
> [4] https://github.com/apache/airflow/pull/58890
>
> Best Regards,
> Jeongwoo Do
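For anyone who wants to see the first half of the proposal concretely, here is a minimal, POSIX-only sketch of the gc.freeze-after-fork pattern Jeongwoo describes. It is not the LocalExecutor or Dag processor code from [1]; parse_dag_file, run_in_child and the path are placeholders for illustration only.

    import gc
    import os
    import time

    def parse_dag_file(path):
        # Placeholder for the real parsing work done in the child process.
        print(f"pid {os.getpid()} parsing {path}")

    def run_in_child(path):
        pid = os.fork()
        if pid == 0:  # child
            try:
                # Move every object inherited from the parent into the GC
                # permanent generation, so later collections never traverse
                # (and therefore never dirty / COW) those shared pages.
                gc.freeze()

                # A forced collection now only sees objects created after the
                # freeze, which is why it becomes so much cheaper.
                start = time.perf_counter()
                gc.collect()
                print(f"forced gc.collect() took {time.perf_counter() - start:.6f}s")

                parse_dag_file(path)
            finally:
                os._exit(0)
        os.waitpid(pid, 0)

    if __name__ == "__main__":
        run_in_child("/tmp/example_dag.py")

Because the permanent generation is excluded from collections, the collector never rewrites the GC headers of the frozen (inherited) objects, which is what would otherwise dirty the shared pages. Calling gc.freeze in the parent just before forking achieves the same effect for the children.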

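And a rough sketch of what the "user-approved pre-import" half could look like. The config key named in the comment (something like parsing_pre_import_modules under [dag_processor]) and the hook that would call this function are purely hypothetical, not existing Airflow options; the point is only that failures stay contained and nothing is imported that the user has not explicitly listed.

    import importlib
    import logging

    log = logging.getLogger(__name__)

    def preload_approved_modules(module_names):
        """Import only the modules the user has explicitly listed.

        A failed import is logged and skipped so it cannot take down the
        parent process; that module is simply left to be imported by each
        child, as happens today.
        """
        loaded = []
        for name in module_names:
            try:
                importlib.import_module(name)
                loaded.append(name)
            except Exception:
                log.warning("Pre-import of %s failed; skipping", name, exc_info=True)
        return loaded

    if __name__ == "__main__":
        # Values that would come from a (hypothetical) user-facing config
        # entry, e.g. [dag_processor] parsing_pre_import_modules = numpy,pandas
        print(preload_approved_modules(["numpy", "pandas"]))

The lazy, child-first ordering described in the proposal (only promote a module to the parent once a child has imported it successfully) would sit on top of something like this.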