On Thu, Aug 28, 2025 at 11:01 PM Serge E. Hallyn <se...@hallyn.com> wrote:
> On Wed, Aug 27, 2025 at 05:32:02PM -0700, Andy Lutomirski wrote:
> > On Wed, Aug 27, 2025 at 5:14 PM Aleksa Sarai <cyp...@cyphar.com> wrote:
> > >
> > > On 2025-08-26, Mickaël Salaün <m...@digikod.net> wrote:
> > > > On Tue, Aug 26, 2025 at 11:07:03AM +0200, Christian Brauner wrote:
> > > > > Nothing has changed in that regard and I'm not interested in stuffing
> > > > > the VFS APIs full of special-purpose behavior to work around the fact
> > > > > that this is work that needs to be done in userspace. Change the apps,
> > > > > stop pushing more and more cruft into the VFS that has no business
> > > > > there.
> > > >
> > > > It would be interesting to know how to patch user space to get the same
> > > > guarantees... Do you think I would propose a kernel patch otherwise?
> > >
> > > You could mmap the script file with MAP_PRIVATE. This is the *actual*
> > > protection the kernel uses against overwriting binaries (yes, ETXTBSY is
> > > nice but IIRC there are ways to get around it anyway).
> >
> > Wait, really? MAP_PRIVATE prevents writes to the mapping from
> > affecting the file, but I don't think that writes to the file will
> > break the MAP_PRIVATE CoW if it's not already broken.
> >
> > IPython says:
> >
> > In [1]: import mmap, tempfile
> >
> > In [2]: f = tempfile.TemporaryFile()
> >
> > In [3]: f.write(b'initial contents')
> > Out[3]: 16
> >
> > In [4]: f.flush()
> >
> > In [5]: map = mmap.mmap(f.fileno(), f.tell(), flags=mmap.MAP_PRIVATE,
> > prot=mmap.PROT_READ)
> >
> > In [6]: map[:]
> > Out[6]: b'initial contents'
> >
> > In [7]: f.seek(0)
> > Out[7]: 0
> >
> > In [8]: f.write(b'changed')
> > Out[8]: 7
> >
> > In [9]: f.flush()
> >
> > In [10]: map[:]
> > Out[10]: b'changed contents'
>
> That was surprising to me, however, if I split the reader
> and writer into different processes, so

Testing this in Python is a terrible idea because it hides the actual
syscalls from you.

> P1:
> f = open("/tmp/3", "w")
> f.write('initial contents')
> f.flush()
>
> P2:
> import mmap
> f = open("/tmp/3", "r")
> map = mmap.mmap(f.fileno(), f.tell(), flags=mmap.MAP_PRIVATE,
> prot=mmap.PROT_READ)
>
> Back to P1:
> f.seek(0)
> f.write('changed')
>
> Back to P2:
> map[:]
>
> Then P2 gives me:
>
> b'initial contents'

Because when you executed `f.write('changed')`, Python buffered the
write internally. "changed" is never actually written to the file in
your example. If you add an `f.flush()` in P1 after this, running
`map[:]` in P2 again will show you the new data.
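
To make the buffering point concrete, here is a minimal single-process
sketch (not something posted in the thread). The function name
observe_private_mapping() is made up for illustration. It checks the
file contents with os.pread(), which bypasses Python's buffer, and
compares them with what the MAP_PRIVATE mapping shows before and after
flush(). Note that POSIX leaves it unspecified whether later writes to
the file are visible through a MAP_PRIVATE mapping; the "after" result
is what Linux is observed to do for pages that have not been CoW'd.

```python
import mmap
import os
import tempfile


def observe_private_mapping():
    """Return (file, mapping) contents before and after flush()."""
    with tempfile.TemporaryFile() as f:
        f.write(b"initial contents")
        f.flush()  # make sure the initial data reaches the file

        # Length 0 maps the whole file.
        m = mmap.mmap(f.fileno(), 0, flags=mmap.MAP_PRIVATE,
                      prot=mmap.PROT_READ)
        try:
            f.seek(0)
            f.write(b"changed")  # sits in Python's userspace buffer

            # pread() goes straight to the kernel, so it shows what is
            # really in the file at this point.
            file_before = os.pread(f.fileno(), 16, 0)
            map_before = m[:]

            f.flush()  # only now is write(2) issued
            file_after = os.pread(f.fileno(), 16, 0)
            map_after = m[:]
        finally:
            m.close()
    return file_before, map_before, file_after, map_after


if __name__ == "__main__":
    for label, value in zip(
        ("file before flush", "map before flush",
         "file after flush", "map after flush"),
        observe_private_mapping(),
    ):
        print(f"{label}: {value!r}")
```

Running this under strace should show that no write(2) is issued for
"changed" until the flush() call, which is why the mapping in P2 kept
showing the old data.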