What we had planned for Harvey was a good deal simpler: designate a part of the address space as a "bounce fault to user" area.
When a page fault occurred in that area, info about the fault was sent to an fd (if one was open) or to a note handler. User mode could handle the fault or punt, as it saw fit. The fixup was that user mode had to get the data to satisfy the fault, then tell the kernel what to do. This is much like the work we did on AIX 35 years ago, called external pagers at the time, or the more recent umap work (https://computing.llnl.gov/projects/umap), used fairly widely in HPC. If you go this route, it's a bit less complex than what you are proposing.

On Wed, Jan 7, 2026 at 1:09 PM Bakul Shah via 9fans <[email protected]> wrote:
>
>
> > On Jan 7, 2026, at 8:41 AM, [email protected] wrote:
> >
> > Quoth Bakul Shah via 9fans <[email protected]>:
> >> I have this idea that will horrify most of you!
> >>
> >> 1. Create an mmap device driver. You ask it to a new file handle which you use to communicate about memory mapping.
> >> 2. If you want to mmap some file, you open it and write its file descriptor along with other parameters (file offset, base addr, size, mode, flags) to your mmap file handle.
> >> 3. The mmap driver sets up necessary page table entries but doesn't actually fetch any data before returning from the write.
> >> 4. It can asynchronously kick off io requests on your behalf and fixup page table entries as needed.
> >> 5. Page faults in the mmapped area are serviced by making appropriate read/write calls.
> >> 6. Flags can be used to indicate read-ahead or write-behind for typical serial access.
> >> 7. Similarly msync, munmap etc. can be implemented.
> >>
> >> In a sneaky way this avoids the need for adding any mmap specific syscalls! But the underlying work would be mostly similar in either case.
> >>
> >> The main benefits of mmap are reduced initial latency , "pay as you go" cost structure and ease of use. It is certainly more expensive than reading/writing the same amount of data directly from a program.
> >>
> >> No idea how horrible a hack is needed to implement such a thing or even if it is possible at all but I had to share this ;-)
> >
> > To what end? The problems with mmap have little to do with adding a syscall;
> > they're about how you do things like communicating I/O errors. Especially
> > when flushing the cache.
> >
> > Imagine the following setup -- I've imported 9p.io:
> >
> >     9fs 9pio
> >
> > and then I map a file from it:
> >
> >     mapped = mmap("/n/9pio/plan9/lib/words", OWRITE);
> >
> > Now, I want to write something into the file:
> >
> >     *mapped = 1234;
> >
> > The cached version of the page is dirty, so the OS will
> > eventually need to flush it back with a 9p Twrite; Let's
> > assume that before this happens, the network goes down.
> >
> > How do you communicate the error with userspace?
>
> This was just a brainwave but...
>
> You have a (control) connection with the mmap device to
> set up mmap so might as well use it to convey errors!
> This device would be strictly local to where a program
> runs.
>
> I'd even consider allowing a separate process to mmap,
> by making an address space a first class object. That'd
> move more stuff out of the kernel and allow for more
> interesting/esoteric uses.
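
To make the "bounce fault to user" flow from the top of this message a little more concrete, here is a rough simulation in plain C. It is not Harvey code and it talks to no real kernel interface: Fault, Reply, Region, touch, loadbyte and demopager are all names made up for this sketch, and a present flag per page stands in for the page table entries. It only shows the shape of the exchange: a miss in the region produces a fault record, the user-side pager either supplies the page or punts, and the "kernel" side acts on that answer.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum { PAGESZ = 4096, NPAGES = 4 };

/* Hypothetical record the kernel would send to the fd or note handler. */
typedef struct Fault {
	uintptr_t addr;	/* faulting offset within the bounce region */
	int write;	/* nonzero if the access was a write */
} Fault;

/* Hypothetical answer from user mode: satisfy the fault or punt. */
enum { Resolve, Punt };
typedef struct Reply {
	int action;
	unsigned char data[PAGESZ];	/* page contents when action == Resolve */
} Reply;

typedef void (*Pager)(const Fault *f, Reply *r);

/* Simulated bounce region; present[] stands in for the PTEs. */
typedef struct Region {
	unsigned char mem[NPAGES][PAGESZ];
	int present[NPAGES];
	Pager pager;	/* the user-mode side */
	int nfaults;	/* how many faults were bounced to the pager */
} Region;

/*
 * Stand-in for the kernel fault path: on a miss, bounce the fault
 * to the pager and install whatever page it hands back.
 * Returns 0 on success, -1 if the pager punted.
 */
int
touch(Region *rg, uintptr_t addr, int write)
{
	size_t pg = addr / PAGESZ;
	Fault f = { addr, write };
	Reply r;

	assert(pg < NPAGES);
	if(rg->present[pg])
		return 0;
	memset(&r, 0, sizeof r);
	rg->nfaults++;
	rg->pager(&f, &r);
	if(r.action == Punt)
		return -1;
	memcpy(rg->mem[pg], r.data, PAGESZ);
	rg->present[pg] = 1;
	return 0;
}

/* Read one byte through the fault path; -1 means the fault was not satisfied. */
int
loadbyte(Region *rg, uintptr_t addr)
{
	if(touch(rg, addr, 0) < 0)
		return -1;
	return rg->mem[addr / PAGESZ][addr % PAGESZ];
}

/* Example pager: fills page n with the byte n+1, and punts on the last page. */
void
demopager(const Fault *f, Reply *r)
{
	size_t pg = f->addr / PAGESZ;

	if(pg == NPAGES - 1){
		r->action = Punt;
		return;
	}
	r->action = Resolve;
	memset(r->data, (int)pg + 1, PAGESZ);
}
```

With a zeroed Region whose pager is demopager, the first loadbyte on a page bounces one fault to the pager and later reads of that page do not; a read in the last page comes back as -1 because the pager punted. In the real scheme the punt would presumably turn into whatever the kernel normally does for an unhandled fault, which this sketch does not model.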
