On Mon, Sep 1, 2025 at 4:06 AM Jann Horn <ja...@google.com> wrote:
>
> On Thu, Aug 28, 2025 at 11:01 PM Serge E. Hallyn <se...@hallyn.com> wrote:
> > On Wed, Aug 27, 2025 at 05:32:02PM -0700, Andy Lutomirski wrote:
> > > On Wed, Aug 27, 2025 at 5:14 PM Aleksa Sarai <cyp...@cyphar.com> wrote:
> > > >
> > > > On 2025-08-26, Mickaël Salaün <m...@digikod.net> wrote:
> > > > > On Tue, Aug 26, 2025 at 11:07:03AM +0200, Christian Brauner wrote:
> > > > > > Nothing has changed in that regard and I'm not interested in
> > > > > > stuffing the VFS APIs full of special-purpose behavior to work
> > > > > > around the fact that this is work that needs to be done in
> > > > > > userspace. Change the apps, stop pushing more and more cruft into
> > > > > > the VFS that has no business there.
> > > > >
> > > > > It would be interesting to know how to patch user space to get the
> > > > > same guarantees... Do you think I would propose a kernel patch
> > > > > otherwise?
> > > >
> > > > You could mmap the script file with MAP_PRIVATE. This is the *actual*
> > > > protection the kernel uses against overwriting binaries (yes, ETXTBSY is
> > > > nice but IIRC there are ways to get around it anyway).
> > >
> > > Wait, really? MAP_PRIVATE prevents writes to the mapping from
> > > affecting the file, but I don't think that writes to the file will
> > > break the MAP_PRIVATE CoW if it's not already broken.
> > >
> > > IPython says:
> > >
> > > In [1]: import mmap, tempfile
> > >
> > > In [2]: f = tempfile.TemporaryFile()
> > >
> > > In [3]: f.write(b'initial contents')
> > > Out[3]: 16
> > >
> > > In [4]: f.flush()
> > >
> > > In [5]: map = mmap.mmap(f.fileno(), f.tell(), flags=mmap.MAP_PRIVATE,
> > > prot=mmap.PROT_READ)
> > >
> > > In [6]: map[:]
> > > Out[6]: b'initial contents'
> > >
> > > In [7]: f.seek(0)
> > > Out[7]: 0
> > >
> > > In [8]: f.write(b'changed')
> > > Out[8]: 7
> > >
> > > In [9]: f.flush()
> > >
> > > In [10]: map[:]
> > > Out[10]: b'changed contents'
> >
> > That was surprising to me, however, if I split the reader
> > and writer into different processes, so
>
> Testing this in python is a terrible idea because it obfuscates the
> actual syscalls from you.
>
> > P1:
> > f = open("/tmp/3", "w")
> > f.write('initial contents')
> > f.flush()
> >
> > P2:
> > import mmap
> > f = open("/tmp/3", "r")
> > map = mmap.mmap(f.fileno(), f.tell(), flags=mmap.MAP_PRIVATE,
> > prot=mmap.PROT_READ)
> >
> > Back to P1:
> > f.seek(0)
> > f.write('changed')
> >
> > Back to P2:
> > map[:]
> >
> > Then P2 gives me:
> >
> > b'initial contents'
>
> Because when you executed `f.write('changed')`, Python internally
> buffered the write. "changed" is never actually written into the file
> in your example. If you add a `f.flush()` in P1 after this, running
> `map[:]` in P2 again will show you the new data.
>

These days, one can type in Python, ask an LLM to translate to C, and
get almost-correct output :)

Or one can use os.write(), which is exactly what I should have done.

--Andy
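
For reference, here is a minimal sketch of the experiment redone with
os.write()/os.pwrite(), so that the unbuffered writes go straight to the
kernel. The function name private_mapping_demo and the scratch file from
tempfile.mkstemp() are made up for illustration and are not from the
thread. It records what a read-only MAP_PRIVATE mapping shows after a
buffered write, after the flush, and after a direct os.pwrite().

```python
import mmap
import os
import tempfile


def private_mapping_demo():
    """Return what a read-only MAP_PRIVATE mapping shows at each step."""
    fd, path = tempfile.mkstemp()  # hypothetical scratch file
    try:
        # Unbuffered: this is a write(2) call, nothing is held back.
        os.write(fd, b"initial contents")
        m = mmap.mmap(fd, 16, flags=mmap.MAP_PRIVATE, prot=mmap.PROT_READ)
        before = m[:]

        # Buffered writer: the 7 bytes sit in Python's userspace buffer
        # until flush(), so the file itself is unchanged at first.
        with open(path, "r+b") as f:
            f.write(b"changed")
            buffered = m[:]
            f.flush()
            flushed = m[:]

        # os.pwrite() issues the syscall immediately, at offset 0.
        os.pwrite(fd, b"CHANGED", 0)
        direct = m[:]

        m.close()
        return before, buffered, flushed, direct
    finally:
        os.close(fd)
        os.unlink(path)


if __name__ == "__main__":
    for label, data in zip(("before", "buffered", "flushed", "direct"),
                           private_mapping_demo()):
        print(label, data)
```

The "buffered" value should still be the initial contents, since nothing
has reached the file yet. On Linux I would expect "flushed" and "direct"
to show the new bytes, matching the IPython session quoted above, but
POSIX leaves it unspecified whether later changes to the file are visible
through a MAP_PRIVATE mapping, so treat that part as an observation
rather than a guarantee.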