This is an automated email from the ASF dual-hosted git repository.

potiuk pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/airflow.git


The following commit(s) were added to refs/heads/main by this push:
     new bcda5080b8f Do not remove .pyc and .pyo files after building Python (#58944)
bcda5080b8f is described below

commit bcda5080b8f3b43c63f2b6e5abd788efe5d58582
Author: Jarek Potiuk <[email protected]>
AuthorDate: Tue Dec 2 16:04:52 2025 +0100

    Do not remove .pyc and .pyo files after building Python (#58944)
    
    Removing .pyc files after compilation saves very little
    space. Uncompressed sizes of the regular airflow image are:
    
    Before  7.63GB
    After   7.66GB
    
    So we have images bigger by < 0.5%
    
    And it seems that long running containers without those files can
    suffer from continuous attempts to recreate the .pyc files that
    fail due to lack of permissions and cause negative dentries to
    be continuously created:
    
    https://lwn.net/Articles/814535/
    
    Those negative dentries are created by the kernel, which caches the
    fact that a file was not available. This speeds up lookups but also
    takes a bit of memory. It seems that when compiled Python has
    the .pyc files removed, it tries to recreate them with timestamped
    entries every time a new interpreter is started.
    
    While this is not a problem for long-running processes, because
    those interpreters are run exactly once per container, it is
    a problem if you use `exec` in containers to run health checks.
    
    Every health check starts a new interpreter, and each new
    interpreter creates new negative dentries that take up kernel memory.
    
    By not removing the .pyc files we slightly increase the size of the
    image but slightly improve the startup time (no need to compile
    Python's internal .py files) and get rid of the negative
    dentries problem.
    
    This PR likely:
    
    Fixes: #58509
    Fixes: #42195
---
 Dockerfile                                |  2 +-
 Dockerfile.ci                             |  2 +-
 docker-stack-docs/changelog.rst           | 12 ++++++++++++
 scripts/docker/install_os_dependencies.sh |  2 +-
 4 files changed, 15 insertions(+), 3 deletions(-)

diff --git a/Dockerfile b/Dockerfile
index 179491e0d30..df0bceccb30 100644
--- a/Dockerfile
+++ b/Dockerfile
@@ -424,7 +424,7 @@ function install_python() {
     find /usr/python -depth \
       \( \
         \( -type d -a \( -name test -o -name tests -o -name idle_test \) \) \
-        -o \( -type f -a \( -name '*.pyc' -o -name '*.pyo' -o -name 'libpython*.a' \) \) \
+        -o \( -type f -a \( -name 'libpython*.a' \) \) \
     \) -exec rm -rf '{}' +
     link_python
 }
diff --git a/Dockerfile.ci b/Dockerfile.ci
index b015822fbe6..b69ab45deda 100644
--- a/Dockerfile.ci
+++ b/Dockerfile.ci
@@ -363,7 +363,7 @@ function install_python() {
     find /usr/python -depth \
       \( \
         \( -type d -a \( -name test -o -name tests -o -name idle_test \) \) \
-        -o \( -type f -a \( -name '*.pyc' -o -name '*.pyo' -o -name 'libpython*.a' \) \) \
+        -o \( -type f -a \( -name 'libpython*.a' \) \) \
     \) -exec rm -rf '{}' +
     link_python
 }
diff --git a/docker-stack-docs/changelog.rst b/docker-stack-docs/changelog.rst
index d1073bf5788..b0d5e192d2a 100644
--- a/docker-stack-docs/changelog.rst
+++ b/docker-stack-docs/changelog.rst
@@ -34,6 +34,18 @@ the Airflow team.
        any Airflow version from the ``Airflow 2`` line. There is no guarantee that it will work, but if it does,
        then you can use latest features from that image to build images for previous Airflow versions.
 
+Airflow 3.1.4
+~~~~~~~~~~~~~
+
+In Airflow 3.1.4, the images are build without removing of .pyc and .pyo files when Python is built.
+This increases the size of the image slightly (<0.5%), but improves performance of Python in the container
+because Python does not need to recompile the files on the first run but more importantly, if you use
+``exec`` to run Health Checks, removed .pyc files caused a small but ever growing memory leak in the Unix
+kernel connected to negative ``dentries`` created when .pyc files were attempted to be compiled and failed.
+This over time could lead to out-of-memory issues on the host running the container.
+
+More information about ``dentries`` can be found in `this article <https://lwn.net/Articles/814535/>`_.
+
 Airflow 3.1.0
 ~~~~~~~~~~~~~
 
diff --git a/scripts/docker/install_os_dependencies.sh b/scripts/docker/install_os_dependencies.sh
index 75151bb1f0e..26d01db8a5e 100644
--- a/scripts/docker/install_os_dependencies.sh
+++ b/scripts/docker/install_os_dependencies.sh
@@ -331,7 +331,7 @@ function install_python() {
     find /usr/python -depth \
       \( \
         \( -type d -a \( -name test -o -name tests -o -name idle_test \) \) \
-        -o \( -type f -a \( -name '*.pyc' -o -name '*.pyo' -o -name 'libpython*.a' \) \) \
+        -o \( -type f -a \( -name 'libpython*.a' \) \) \
     \) -exec rm -rf '{}' +
     link_python
 }

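The negative-dentry growth described in the commit message can be observed with a small shell sketch. The helper names `negative_dentries` and `count_pyc` below are hypothetical and are not part of the Airflow image or of this commit. The sketch assumes a Linux kernel 5.0 or newer, where the fifth field of `/proc/sys/fs/dentry-state` reports the number of negative dentries; on older kernels that field is an unused placeholder that reads 0.

```shell
#!/bin/sh
# Hypothetical helpers for illustration only; not part of the Airflow image.

# Print the negative-dentry count, i.e. the fifth field of dentry-state.
# Assumption: Linux 5.0+ (older kernels report an unused 0 in this field).
# An alternative file can be passed as the first argument.
negative_dentries() {
    awk '{ print $5 }' "${1:-/proc/sys/fs/dentry-state}"
}

# Count compiled bytecode files (*.pyc) below the given directory.
count_pyc() {
    find "$1" -type f -name '*.pyc' | wc -l | tr -d ' '
}
```

Calling `negative_dentries` before and after a batch of `exec`-based health checks gives a rough view of whether the count keeps climbing, and `count_pyc /usr/python` inside the container shows whether the compiled files were kept. The counter is system-wide, so other workloads on the same host also affect it.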