arkadiuszbach commented on issue #42195: URL: https://github.com/apache/airflow/issues/42195#issuecomment-3586845274
I gave it another shot and I think I was able to figure this out.

This is what is happening (see https://lwn.net/Articles/814535/):

- The airflow user does not have permission to write into the `__pycache__` subdirectories under `/usr/local/lib/python3.12/`.
- Most of the libraries already have precompiled .pyc files, but when one is missing, Python tries to create it on import/startup.
- First it checks whether the compiled .pyc file exists and, if so, tries to read it:
  ```
  openat(AT_FDCWD</opt/airflow>, "/usr/local/lib/python3.12/encodings/__pycache__/__init__.cpython-312.pyc", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory) <0.000152>
  ```
- If it does not exist, Python creates an intermediate compiled file whose name is the .pyc name plus a numeric suffix (**this creates a negative dentry**):
  ```
  openat(AT_FDCWD</opt/airflow>, "/usr/local/lib/python3.12/encodings/__pycache__/__init__.cpython-312.pyc.136405097451568", O_WRONLY|O_CREAT|O_EXCL|O_CLOEXEC, 0644) = 3</usr/local/lib/python3.12/encodings/__pycache__/__init__.cpython-312.pyc.136405097451568> <0.004813>
  ```
- Then it renames the file (dropping the numeric suffix), probably so that later runs of other Python processes won't have to compile the same file again when importing:
  ```
  newfstatat(3</usr/local/lib/python3.12/encodings/__pycache__/__init__.cpython-312.pyc.136405097451568>, "", {st_mode=S_IFREG|0644, st_size=0, ...}, AT_EMPTY_PATH) = 0 <0.000136>
  write(3</usr/local/lib/python3.12/encodings/__pycache__/__init__.cpython-312.pyc.136405097451568>, "\313\r\r\n\0\0\0\0\277\306Hh\374\26\0\0\343\0\0\0\0\0\0\0\0\0\0\0\0\6\0\0"..., 5781) = 5781 <0.000161>
  close(3</usr/local/lib/python3.12/encodings/__pycache__/__init__.cpython-312.pyc.136405097451568>) = 0 <0.000091>
  rename("/usr/local/lib/python3.12/encodings/__pycache__/__init__.cpython-312.pyc.136405097451568", "/usr/local/lib/python3.12/encodings/__pycache__/__init__.cpython-312.pyc") = 0 <0.000239>
  ```
  `__init__.cpython-312.pyc.136405097451568` was renamed to `__init__.cpython-312.pyc`.
- The airflow user is not able to create the file, and each time the health probe is triggered Python generates a different intermediate name for the .pyc file. Each name gets its own entry in the dentry cache, hence the slow memory increase.

Possible fixes:

- Disable writing .pyc files entirely via https://docs.python.org/3/using/cmdline.html#envvar-PYTHONDONTWRITEBYTECODE - but the probe is already heavy, so this doesn't seem like a good idea.
- Precompile the files within the Docker image: `sudo python3 -m compileall /usr/local/lib/python3.12`
- Give the airflow user write permission to all of the `__pycache__` directories under `/usr/local/lib/python3.12`.

The commands I used to figure this out:

- `cat /sys/fs/cgroup/memory.stat | grep slab` - container slab memory usage (dentries are part of slab)
- `cat /proc/sys/fs/dentry-state` - see https://www.kernel.org/doc/html/v6.6/admin-guide/sysctl/fs.html#dentry-state, used to track the negative dentry count
- `strace -T -e trace=desc,open,mkdir,rmdir,unlink,rename --decode-fds=all python` - display the file operations made by a command; search the output for dynamically generated file names
- trace dentry add/kill requests via bpftrace:

```
# required by bpftrace
sudo mkdir -p /sys/kernel/tracing
sudo mount -t tracefs nodev /sys/kernel/tracing
sudo mkdir -p /sys/kernel/debug
sudo mount -t debugfs nodev /sys/kernel/debug

sudo bpftrace -e '
#define TARGET_COMM "python"

// Log every dentry allocation requested by the target process.
// Note: in mainline kernels d_alloc() takes (parent, name), so arg0 here
// may be the parent dentry rather than the newly allocated one.
kprobe:d_alloc
/(comm == TARGET_COMM)/
{
    $dentry = (struct dentry *)arg0;
    $parent_name = str($dentry->d_parent->d_name.name);
    $parent_parent_name = str($dentry->d_parent->d_parent->d_name.name);

    @add_process[comm] = count();
    printf("ADD | PARENT: %s/%s | FILE: %s | Negative?: %p | Dentry Addr: %p | Caller: %s(%d)\n",
        $parent_parent_name, $parent_name, str($dentry->d_name.name),
        $dentry->d_inode, arg0, comm, pid);
}

// Log every dentry that is freed on behalf of the target process.
kprobe:__dentry_kill
/(comm == TARGET_COMM)/
{
    $main = (struct dentry *)arg0;

    @delete_process[comm] = count();
    printf("KILL %s; negative?: %p; a: %p; caller: %s(%d)\n",
        str($main->d_name.name), $main->d_inode, arg0, comm, pid);
}
'
```

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
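As a complement to the diagnosis in the comment above, here is a minimal Python sketch for listing the source files that would trigger the failing .pyc write on every import. The function name `find_uncached_sources` is hypothetical and not part of Airflow or CPython. It only checks whether the default cache file exists and whether the current user could write it, so it does not reproduce the importer's full logic (stale .pyc files, for example, are ignored).

```python
import importlib.util
import os


def find_uncached_sources(root):
    """Yield (source_path, cache_dir_writable) for .py files without a cached .pyc.

    Hypothetical helper: approximates where Python would try to write bytecode.
    """
    for dirpath, dirnames, filenames in os.walk(root):
        # Do not descend into the cache directories themselves.
        dirnames[:] = [d for d in dirnames if d != "__pycache__"]
        for name in filenames:
            if not name.endswith(".py"):
                continue
            source = os.path.join(dirpath, name)
            # Path where the running interpreter would cache the bytecode.
            cached = importlib.util.cache_from_source(source)
            if os.path.exists(cached):
                continue
            cache_dir = os.path.dirname(cached)
            # If __pycache__ does not exist yet, it would have to be created
            # inside the directory that holds the source file.
            target = cache_dir if os.path.isdir(cache_dir) else dirpath
            yield source, os.access(target, os.W_OK)
```

Running `list(find_uncached_sources("/usr/local/lib/python3.12"))` as the airflow user inside the image should list the modules affected: entries whose second element is `False` are the ones where each import would attempt, and fail, to create a uniquely named temporary file. After the `compileall` fix the list is expected to be empty or close to it.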
