Guys, I'm really sorry for the confusion.
A week ago I posted this, as a loose analogy, trying to relate what Ron said to
my earlier idea:
On Tuesday, February 10, 2026, at 10:13 AM, Alyssa M wrote:
> On Monday, February 09, 2026, at 3:24 PM, ron minnich wrote:
>> as for mmap, there's already a defacto mmap happening for executables. They
>> are not read into memory. In fact, the first instruction you run in a binary
>> results in a page fault.
> I thinking one could bring the same transparent/defacto memory mapping to
> read(2) and write(2), so the API need not change at all.
Unfortunately it got taken literally and I've been fire-fighting that ever
since:
On Thursday, February 12, 2026, at 3:12 AM, Alyssa M wrote:
> Uses of mmap(2) and msync(2) could be replaced with read(2) and write(2) that
> sometimes use memory mapping as part of their implementation.
On Thursday, February 12, 2026, at 8:37 AM, Alyssa M wrote:
> The new read implementation would do this, but as a memory mapped snapshot.
> This looks no different to the programmer from how reads have always worked,
On Saturday, February 14, 2026, at 3:35 AM, Alyssa M wrote:
> This is not mmap, and not really memory mapping in the conventional sense
> either. But I think it can do the things an mmap user is looking for
On Monday, February 16, 2026, at 2:24 AM, Alyssa M wrote:
> So this is not mmap by another name. It's an optimization of the standard
> read/write approach that has some of the desirable characteristics of mmap.
All of these statements are my clumsy attempts to explain the same thing. There
is memory mapping happening, in the sense that parts of the address space are
sometimes associated with parts of files. But that's where the similarity ends.
None of that is apparent to the programmer at any time. If it were, it would be
broken. Obviously.
In some ways this is like lazy evaluation: if you were actually to use all the
data it would be slower. But if you only need part of it you can get access to
it in memory up front (with read) without knowing ahead of time which bits you
need.
This is not about making I/O faster, it's about doing less of it.
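A minimal sketch of the lazy-evaluation idea, in user-space Python rather than
kernel code. The names (LazyReadBuffer, lazy_read, fetch_page) and the 4096-byte
page size are assumptions made up for illustration; a real implementation would
rely on page faults rather than an index operator.

```python
PAGE = 4096  # assumed page size for this sketch


class LazyReadBuffer:
    """Stands in for a read buffer whose pages are fetched only when touched."""

    def __init__(self, fetch_page, length):
        self._fetch_page = fetch_page  # hypothetical callback: page index -> bytes
        self._length = length
        self._pages = {}               # pages "faulted in" so far
        self.pages_fetched = 0

    def __len__(self):
        return self._length

    def __getitem__(self, offset):
        if not 0 <= offset < self._length:
            raise IndexError(offset)
        index = offset // PAGE
        if index not in self._pages:   # the simulated page fault
            self._pages[index] = self._fetch_page(index)
            self.pages_fetched += 1
        return self._pages[index][offset % PAGE]


def lazy_read(data, length):
    """Pretend read: returns at once, having transferred nothing yet."""
    def fetch_page(index):
        return data[index * PAGE:(index + 1) * PAGE]
    return LazyReadBuffer(fetch_page, min(length, len(data)))


file_data = bytes(range(256)) * 64   # 16384 bytes, i.e. 4 pages
buf = lazy_read(file_data, len(file_data))
first = buf[0]                       # touches page 0 only
last = buf[len(buf) - 1]             # touches page 3 only
```

The caller asked for the whole file but only two of the four pages were ever
fetched. Touching every byte would cost more than a plain copy, which is the
trade-off described above.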
With regard to the snapshots:
Snapshots are just there to ensure that the proper read semantics are preserved.
On Monday, February 16, 2026, at 1:49 PM, Ori Bernstein wrote:
> Please describe, in detail, short of a per-file snapshot being
> supported by the fs, what management to changes would actually
> lead to zeros being read in the test C code?
When you're short of a per-file snapshot being supported by the fs, you get the
traditional code path in read:
On Monday, February 16, 2026, at 1:49 PM, Ori Bernstein wrote:
> in another terminal, while this process is sleeping,
> you run this command:
> dd -if /dev/random -of test -bs 16k -count 1kk
A normal read has happened because the kernel was unable to obtain a snapshot.
The zeroes are already in memory (assuming you can actually read 16GB of zeroes
from a file into the process within an hour!).
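A rough sketch of that fallback decision, again in Python. The interface is
hypothetical: here a server is assumed to advertise snapshot support simply by
having a snapshot method, which is not how any real file server protocol
negotiates it.

```python
class PlainServer:
    """A file server with no snapshot support (hypothetical interface)."""

    def __init__(self, contents):
        self.contents = bytearray(contents)

    def read(self, offset, length):
        return bytes(self.contents[offset:offset + length])


def kernel_read(server, offset, length):
    """Return (mode, data) for a read request against the given server."""
    make_snapshot = getattr(server, "snapshot", None)
    if make_snapshot is None:
        # Traditional path: every byte is copied in before read returns.
        return "copied", server.read(offset, length)
    # Snapshot path: hand back something that can be demand-loaded later.
    return "snapshot", make_snapshot(offset, length)


server = PlainServer(bytes(16))          # a small file of zeroes
mode, data = kernel_read(server, 0, 16)
server.contents[:] = b"\xaa" * 16        # a later overwrite, like the dd
```

Because the data was copied eagerly, the later overwrite cannot change what the
process already holds.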
However:
It would be convenient for this if servers for disk file systems had the
ability to create a snapshot of a range of bytes from a file. But they
generally don't. So I'm building a file system wrapper layer (a file server
that talks to another file server) that provides snapshot files to the kernel
via an extension of the file server protocol, in addition to what the
underlying file system already provides. My current implementation does this
with temporary files. When a file is written, the temporary file gets any
original data that's about to be overwritten. The snapshot provided to the
kernel is a synthetic blend of the original file and any bytes that were
rescued and put in the temporary file. In most uses the original file will
never be touched by another process and the temporary file won't even be
created.
The wrapper requires exclusive access to the file server underneath, and also
requires the contents to be stable. It is the wrapper that is mounted in the
namespace. So the wrapper sees all attempts to alter any file, and can ensure
that any snapshot maintains the illusion of being a full prior copy when
writes later happen to the file it came from.
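The rescue-before-overwrite scheme can be sketched as follows. This is a toy
model, not the actual wrapper: it holds one file in memory, all names
(SnapshotWrapper, Snapshot, rescue) are invented for illustration, and a
per-byte dict plays the role of the temporary file.

```python
class Snapshot:
    """Read-only view of a byte range as it was when the snapshot was taken."""

    def __init__(self, wrapper, offset, length):
        self.wrapper = wrapper
        self.offset = offset
        self.length = length
        self.rescued = {}  # file offset -> original byte (the "temporary file")

    def rescue(self, start, end):
        # Save original bytes in [start, end) that this snapshot covers.
        lo = max(start, self.offset)
        hi = min(end, self.offset + self.length, len(self.wrapper.file))
        for pos in range(lo, hi):
            self.rescued.setdefault(pos, self.wrapper.file[pos])

    def read(self, offset, count):
        # Blend rescued bytes with whatever is still untouched in the file.
        start = self.offset + offset
        end = min(start + count, self.offset + self.length)
        out = bytearray()
        for pos in range(start, end):
            if pos in self.rescued:
                out.append(self.rescued[pos])
            else:
                out.append(self.wrapper.file[pos])
        return bytes(out)

    def close(self):
        self.wrapper.snapshots.remove(self)
        self.rescued.clear()


class SnapshotWrapper:
    """Toy wrapper that sees every change made to its single file."""

    def __init__(self, contents):
        self.file = bytearray(contents)
        self.snapshots = []

    def snapshot(self, offset, length):
        length = min(length, max(0, len(self.file) - offset))
        snap = Snapshot(self, offset, length)
        self.snapshots.append(snap)
        return snap

    def write(self, offset, data):
        end = offset + len(data)
        for snap in self.snapshots:
            snap.rescue(offset, end)       # save old bytes first
        if offset > len(self.file):        # zero-fill any gap
            self.file.extend(bytes(offset - len(self.file)))
        self.file[offset:end] = data

    def truncate(self, size):
        for snap in self.snapshots:
            snap.rescue(size, len(self.file))
        del self.file[size:]


wrapper = SnapshotWrapper(bytes(8192))     # a file of zeroes
snap = wrapper.snapshot(0, 8192)           # what the kernel would be handed
untouched = dict(snap.rescued)             # nothing rescued until a write
wrapper.write(0, b"\xff" * 4096)           # another process overwrites half
wrapper.truncate(1024)                     # and then truncates the file
still_zero = snap.read(0, 8192) == bytes(8192)
```

Until something writes to the file, the rescued store stays empty, which
matches the common case where no temporary file is ever created.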
So with the wrapper in place...
On Monday, February 16, 2026, at 1:49 PM, Ori Bernstein wrote:
> in another terminal, while this process is sleeping,
> you run this command:
> dd -if /dev/random -of test -bs 16k -count 1kk
The dd will cause the wrapper file system to copy the zeroes into a temporary
file in the snapshot as it replaces them in the file with random data (which
would presumably take three times as long as the earlier example). The process
will wake up after an hour and fault zeroes in from one page in the temporary
file in the snapshot. The same happens if you delete or truncate the file: the
wrapper will save the data in the temporary file first.
The kernel always sees a stable snapshot (if it sees one at all), which appears
to be a readonly file the size of the read buffer. It can demand-load from that
without interference for as long as it's needed, and it will be removed on close.
The wrapper can be local or remote - wherever the file system needs to be
shared.
On Monday, February 16, 2026, at 10:56 AM, Frank D. Engel, Jr. wrote:
> You could theoretically work around that by extending the protocol in a
> detectable way to provide the required support and only enabling this
> feature for filesystems which declare correct implementation of the
> extensions
Yes indeed.
On my hobby OS the file server protocol is not 9P, and adding a new request
type to it is easy and expected.
For 9P it would probably be most sensible to do something like what I did for
Linux last year, and use some kind of control file to communicate the request.
It's kind of ugly, but can likely be done. Given how rarely this mechanism
is likely to be used in practice, that's probably reasonable.
There's still more to say about what happens to memory pages after a write
occurs, and about a really nice result that falls naturally out of all this
when write knows where the data came from. But given the responses I've seen so
far I'm not sure I dare...
On Monday, February 16, 2026, at 12:24 PM, hiro wrote:
> as you keep going in the same direction without responding to any of the
> doubt, i can't help but blame sycophantic AI, I don't manage to convince
> myself any more you're actually engaging with us properly as a human.
I don't post AI slop.
I hadn't actually intended to go into so much detail in this thread,
particularly as I can't see myself doing this on Plan 9 - though I'm sure
someone more expert than me could.
I'm dismayed by the responses so far, because I think this is potentially a lot
better than mmap.
------------------------------------------
9fans: 9fans
Permalink:
https://9fans.topicbox.com/groups/9fans/Te8d7c6e48b5c075b-M160f64e4b54be3298727d64a