arkadiuszbach commented on issue #42195:
URL: https://github.com/apache/airflow/issues/42195#issuecomment-3586845274

   I gave it another shot and I think I was able to figure this out.
   
   This is what is happening (see https://lwn.net/Articles/814535/):
   
   - The airflow user does not have permission to write into the `__pycache__` subdirectories under `/usr/local/lib/python3.12/`.
   - Most of the libraries already have precompiled .pyc files, but if one is missing, Python tries to create it when the module is imported at startup:
     - First it checks whether the compiled .pyc file exists and, if so, tries to read it:
       `openat(AT_FDCWD</opt/airflow>, "/usr/local/lib/python3.12/encodings/__pycache__/__init__.cpython-312.pyc", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory) <0.000152>`
     - If it does not exist, Python creates an intermediate compiled file whose name has a numeric suffix after `.pyc` (**this creates a negative dentry**):
       `openat(AT_FDCWD</opt/airflow>, "/usr/local/lib/python3.12/encodings/__pycache__/__init__.cpython-312.pyc.136405097451568", O_WRONLY|O_CREAT|O_EXCL|O_CLOEXEC, 0644) = 3</usr/local/lib/python3.12/encodings/__pycache__/__init__.cpython-312.pyc.136405097451568> <0.004813>`
     - Then it renames the file (dropping the numeric suffix), probably so that other Python processes do not have to compile the same file again when they import it:
       ```
       newfstatat(3</usr/local/lib/python3.12/encodings/__pycache__/__init__.cpython-312.pyc.136405097451568>, "", {st_mode=S_IFREG|0644, st_size=0, ...}, AT_EMPTY_PATH) = 0 <0.000136>
       write(3</usr/local/lib/python3.12/encodings/__pycache__/__init__.cpython-312.pyc.136405097451568>, "\313\r\r\n\0\0\0\0\277\306Hh\374\26\0\0\343\0\0\0\0\0\0\0\0\0\0\0\0\6\0\0"..., 5781) = 5781 <0.000161>
       close(3</usr/local/lib/python3.12/encodings/__pycache__/__init__.cpython-312.pyc.136405097451568>) = 0 <0.000091>
       rename("/usr/local/lib/python3.12/encodings/__pycache__/__init__.cpython-312.pyc.136405097451568", "/usr/local/lib/python3.12/encodings/__pycache__/__init__.cpython-312.pyc") = 0 <0.000239>
       ```
       `__init__.cpython-312.pyc.136405097451568` was renamed to  
`__init__.cpython-312.pyc`
   
   - The airflow user is not able to create the file, so the .pyc is never cached. Each time the health probe is triggered, Python generates a different intermediate name for the .pyc file, and each of those names gets its own entry in the dentry cache, hence the slow memory increase.
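
To make the write-then-rename sequence from the trace above easier to follow, here is a minimal Python sketch of the same pattern. It is an illustration, not CPython's actual code: the function name `write_atomic` is hypothetical, and deriving the suffix from `id(path)` is an assumption based on the shape of the temporary names in the strace output.

```python
import os
import tempfile


def write_atomic(path: str, data: bytes, mode: int = 0o644) -> None:
    """Write data to a temporary sibling of path, then rename it into place."""
    # Assumption: a numeric suffix that usually differs between processes,
    # mirroring names like "__init__.cpython-312.pyc.136405097451568".
    tmp_path = f"{path}.{id(path)}"
    # O_EXCL makes the open fail if the temporary name already exists.
    # In a directory the user cannot write to, this call raises
    # PermissionError and nothing is ever cached.
    fd = os.open(tmp_path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, mode)
    try:
        with os.fdopen(fd, "wb") as fh:
            fh.write(data)
        # The rename publishes the finished file under its final name.
        os.replace(tmp_path, path)
    except OSError:
        # Best-effort cleanup of the temporary file on failure.
        try:
            os.unlink(tmp_path)
        except OSError:
            pass
        raise


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as cache_dir:
        target = os.path.join(cache_dir, "__init__.cpython-312.pyc")
        write_atomic(target, b"fake bytecode")
        print(os.listdir(cache_dir))
```

Because the temporary name changes from run to run, a probe that is restarted over and over keeps asking the kernel about names it has never seen before, which matches the dentry growth described above.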
   
   
   Possible fixes:
   - disable writing bytecode entirely via https://docs.python.org/3/using/cmdline.html#envvar-PYTHONDONTWRITEBYTECODE; the probe is already heavy though, so this doesn't seem like a good idea
   - precompile the files within the Docker image: `sudo python3 -m compileall /usr/local/lib/python3.12`
   - give the airflow user write permission to all of the `__pycache__` directories under `/usr/local/lib/python3.12`
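
Before picking one of these fixes, it may help to check which modules are actually missing cached bytecode. The sketch below is one way to do that, assuming it is run with the same interpreter as the image; the helper name `missing_bytecode` is hypothetical, while `importlib.util.cache_from_source` is the standard-library call that maps a source file to its `__pycache__` path.

```python
import importlib.util
import os


def missing_bytecode(root: str) -> list[str]:
    """Return .py files under root that have no cached .pyc for this interpreter."""
    missing = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.endswith(".py"):
                continue
            source = os.path.join(dirpath, name)
            # e.g. pkg/mod.py -> pkg/__pycache__/mod.cpython-312.pyc
            cached = importlib.util.cache_from_source(source)
            if not os.path.exists(cached):
                missing.append(source)
    return sorted(missing)


if __name__ == "__main__":
    for path in missing_bytecode("/usr/local/lib/python3.12"):
        print(path)
```

An empty result after running `compileall` in the image build would suggest that the probe no longer has anything left to compile at startup.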
   
   
   The commands I used to figure this out:
   - `cat /sys/fs/cgroup/memory.stat | grep slab` - container slab memory usage 
(dentry is part of slab)
   - `cat /proc/sys/fs/dentry-state` - 
https://www.kernel.org/doc/html/v6.6/admin-guide/sysctl/fs.html#dentry-state, 
used to track negative dentries count
   - `strace -T -e trace=desc,open,mkdir,rmdir,unlink,rename --decode-fds=all python` - displays the file operations performed by the command; look for dynamically generated file names
   - trace dentry add/kill requests via bpftrace:
     ```
       # required by bpftrace 
       sudo mkdir -p /sys/kernel/tracing
       sudo mount -t tracefs nodev /sys/kernel/tracing
       sudo mkdir -p /sys/kernel/debug
       sudo mount -t debugfs nodev /sys/kernel/debug
      
       sudo bpftrace -e '
       #define TARGET_COMM "python" 
       // d_alloc(struct dentry *parent, const struct qstr *name):
       // arg0 is the parent directory, arg1 is the name being allocated.
       // The new dentry does not exist yet at function entry, so its
       // address and inode cannot be printed here.
       kprobe:d_alloc
       /(comm == TARGET_COMM)/
       {
           $parent = (struct dentry *)arg0;
           $name = (struct qstr *)arg1;
           @add_process[comm] = count();
           printf("ADD | PARENT: %s/%s | FILE: %s | Parent Addr: %p | Caller: %s(%d)\n",
                  str($parent->d_parent->d_name.name),
                  str($parent->d_name.name),
                  str($name->name),
                  arg0,
                  comm,
                  pid);
       }
       
       kprobe:__dentry_kill
       /(comm == TARGET_COMM)/
       {
           $main = (struct dentry *)arg0;
           @delete_process[comm] = count();
           printf("KILL %s; negative?: %p; a: %p; caller: %s(%d)\n",
                  str($main->d_name.name), $main->d_inode, arg0, comm, pid);
       
       }
       '
     ```
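
To watch the negative dentry count over time instead of reading `/proc/sys/fs/dentry-state` by eye, a small parser can help. This is a sketch under the assumption that the file holds six integers in the order given by the kernel documentation linked above (`nr_dentry`, `nr_unused`, `age_limit`, `want_pages`, `nr_negative`, `dummy`); the names `DentryState`, `parse_dentry_state` and `read_dentry_state` are hypothetical.

```python
from typing import NamedTuple


class DentryState(NamedTuple):
    nr_dentry: int
    nr_unused: int
    age_limit: int
    want_pages: int
    nr_negative: int
    dummy: int


def parse_dentry_state(text: str) -> DentryState:
    """Parse the whitespace-separated contents of dentry-state."""
    fields = [int(value) for value in text.split()]
    if len(fields) != 6:
        raise ValueError(f"expected 6 fields, got {len(fields)}")
    return DentryState(*fields)


def read_dentry_state(path: str = "/proc/sys/fs/dentry-state") -> DentryState:
    """Read and parse dentry-state from the running kernel (Linux only)."""
    with open(path) as fh:
        return parse_dentry_state(fh.read())


if __name__ == "__main__":
    state = read_dentry_state()
    print(f"dentries: {state.nr_dentry}, negative: {state.nr_negative}")
```

Sampling `nr_negative` before and after a few probe runs should show whether the count keeps climbing. On kernels older than the one documented above, the fifth field may not carry the negative count.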
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
