Right, so, getting back to the original discussion, Bakul, I think the right path forward is to implement a device that supports an external pager, rather than mmap.
But code wins, so, quick, somebody, implement something :-)

On Thu, Jan 8, 2026 at 11:03 PM David Leimbach via 9fans <[email protected]> wrote:

> I’d been impressed by L4. It’s certainly been deployed pretty broadly.
>
> And it has recursive pagers but … not sure how that’s used in practice.
>
> And there are a bunch of variants.
> Sent from my iPhone
>
> > On Jan 8, 2026, at 9:23 PM, [email protected] wrote:
> >
> > I vaguely remember someone being quoted as saying
> >
> > Microkernels don't have to be small. They just have to
> > not do much.
> >
> > :-)
> >
> > ron minnich <[email protected]> wrote:
> >
> >> I would not tar the idea of external pagers with the Mach tarbrush. Mach
> >> was pretty much inefficient at everything, including external pagers.
> >> External pagers can work well, when implemented well.
> >>
> >>> On Thu, Jan 8, 2026 at 8:41 PM Paul Lalonde <[email protected]>
> >>> wrote:
> >>>
> >>> Did the same on GPUs/Xeon Phi, including in the texture units. Very
> >>> useful mechanism for abstracting compute with random access
> >>> characteristics.
> >>>
> >>> Paul
> >>>
> >>>> On Wed, Jan 7, 2026, 1:35 p.m. ron minnich <[email protected]> wrote:
> >>>> what we had planned for harvey was a good deal simpler: designate a
> >>>> part of the address space as a "bounce fault to user" space area.
> >>>>
> >>>> When a page fault in that area occurred, info about the fault was
> >>>> sent to an fd (if it was opened) or a note handler.
> >>>>
> >>>> user could handle the fault or punt, as it saw fit. The fixup was
> >>>> that user mode had to get the data to satisfy the fault, then tell the
> >>>> kernel what to do.
> >>>>
> >>>> This is much like the 35-years-ago work we did on AIX, called
> >>>> external pagers at the time; or the more recent umap work,
> >>>> https://computing.llnl.gov/projects/umap, used fairly widely in HPC.
> >>>>
> >>>> If you go this route, it's a bit less complex than what you are
> >>>> proposing.
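
For anyone who wants to poke at the "bounce fault to user" idea quoted above without touching a kernel, here is a minimal user-space sketch. It is a stand-in under loud assumptions, not the Harvey design and not Plan 9 code: it uses POSIX mmap/mprotect/sigaction (so Linux or similar), a SIGSEGV handler plays the part of the note handler, and mprotect plus a copy plays the part of "tell the kernel what to do". The names region, fetchpage, onfault and pagerdemo are made up for illustration, and the backing store is a toy that fills page n with the byte 'A'+n. Calling mprotect from a signal handler is common practice for this trick but is not guaranteed by POSIX.

```c
#define _DEFAULT_SOURCE		/* MAP_ANONYMOUS and sigaction on glibc */
#include <assert.h>
#include <signal.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

enum { NPAGES = 4 };			/* size of the bounce area, in pages */

static char *region;			/* the "bounce fault to user" area */
static long pagesz;
static volatile sig_atomic_t nfaults;	/* faults satisfied so far */

/* Hypothetical backing store: page n is filled with the byte 'A'+n. */
static void
fetchpage(char *dst, long n)
{
	memset(dst, 'A' + (int)n, (size_t)pagesz);
}

/* Fault handler: plays the role of the user-level pager. */
static void
onfault(int sig, siginfo_t *si, void *uctx)
{
	uintptr_t a = (uintptr_t)si->si_addr;
	uintptr_t base = (uintptr_t)region;
	long n;

	(void)sig;
	(void)uctx;
	if(a < base || a >= base + (uintptr_t)(NPAGES*pagesz))
		_exit(1);		/* not in our area: punt */
	n = (long)((a - base) / (uintptr_t)pagesz);
	/* "tell the kernel what to do": make the page accessible ... */
	if(mprotect(region + n*pagesz, (size_t)pagesz, PROT_READ|PROT_WRITE) != 0)
		_exit(2);
	/* ... and supply the data that satisfies the fault */
	fetchpage(region + n*pagesz, n);
	nfaults++;
}

/*
 * Reserve an inaccessible area, touch three addresses in two distinct
 * pages, and return the number of faults handled (-1 on setup failure).
 */
int
pagerdemo(char out[3])
{
	struct sigaction sa, oldsegv, oldbus;
	volatile char *p;

	pagesz = sysconf(_SC_PAGESIZE);
	assert(pagesz > 0);
	nfaults = 0;
	region = mmap(NULL, (size_t)(NPAGES*pagesz), PROT_NONE,
		MAP_PRIVATE|MAP_ANONYMOUS, -1, 0);
	if(region == MAP_FAILED)
		return -1;
	memset(&sa, 0, sizeof sa);
	sa.sa_sigaction = onfault;
	sa.sa_flags = SA_SIGINFO;
	sigemptyset(&sa.sa_mask);
	if(sigaction(SIGSEGV, &sa, &oldsegv) != 0
	|| sigaction(SIGBUS, &sa, &oldbus) != 0)
		return -1;

	p = region;
	out[0] = p[0];			/* faults: page 0 is fetched */
	out[1] = p[2*pagesz + 5];	/* faults: page 2 is fetched */
	out[2] = p[1];			/* page 0 already present: no fault */

	sigaction(SIGSEGV, &oldsegv, NULL);
	sigaction(SIGBUS, &oldbus, NULL);
	munmap(region, (size_t)(NPAGES*pagesz));
	return (int)nfaults;
}
```

Calling pagerdemo(out) from a main should report two faults and leave 'A', 'C', 'A' in out. The fd flavour of the same idea is closer to what Linux exposes as userfaultfd; the signal version is just shorter to show.
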
> >>>>
> >>>> On Wed, Jan 7, 2026 at 1:09 PM Bakul Shah via 9fans <[email protected]>
> >>>> wrote:
> >>>>
> >>>>>
> >>>>>> On Jan 7, 2026, at 8:41 AM, [email protected] wrote:
> >>>>>>
> >>>>>> Quoth Bakul Shah via 9fans <[email protected]>:
> >>>>>>> I have this idea that will horrify most of you!
> >>>>>>>
> >>>>>>> 1. Create an mmap device driver. You ask it for a new file handle,
> >>>>>>> which you use to communicate about memory mapping.
> >>>>>>> 2. If you want to mmap some file, you open it and write its file
> >>>>>>> descriptor along with other parameters (file offset, base addr,
> >>>>>>> size, mode, flags) to your mmap file handle.
> >>>>>>> 3. The mmap driver sets up necessary page table entries but doesn't
> >>>>>>> actually fetch any data before returning from the write.
> >>>>>>> 4. It can asynchronously kick off io requests on your behalf and
> >>>>>>> fixup page table entries as needed.
> >>>>>>> 5. Page faults in the mmapped area are serviced by making
> >>>>>>> appropriate read/write calls.
> >>>>>>> 6. Flags can be used to indicate read-ahead or write-behind for
> >>>>>>> typical serial access.
> >>>>>>> 7. Similarly msync, munmap etc. can be implemented.
> >>>>>>>
> >>>>>>> In a sneaky way this avoids the need for adding any mmap specific
> >>>>>>> syscalls! But the underlying work would be mostly similar in either
> >>>>>>> case.
> >>>>>>>
> >>>>>>> The main benefits of mmap are reduced initial latency, "pay as you
> >>>>>>> go" cost structure and ease of use. It is certainly more expensive
> >>>>>>> than reading/writing the same amount of data directly from a program.
> >>>>>>>
> >>>>>>> No idea how horrible a hack is needed to implement such a thing or
> >>>>>>> even if it is possible at all but I had to share this ;-)
> >>>>>>
> >>>>>> To what end? The problems with mmap have little to do with adding a
> >>>>>> syscall; they're about how you do things like communicating I/O
> >>>>>> errors. Especially when flushing the cache.
> >>>>>>
> >>>>>> Imagine the following setup -- I've imported 9p.io:
> >>>>>>
> >>>>>>     9fs 9pio
> >>>>>>
> >>>>>> and then I map a file from it:
> >>>>>>
> >>>>>>     mapped = mmap("/n/9pio/plan9/lib/words", OWRITE);
> >>>>>>
> >>>>>> Now, I want to write something into the file:
> >>>>>>
> >>>>>>     *mapped = 1234;
> >>>>>>
> >>>>>> The cached version of the page is dirty, so the OS will
> >>>>>> eventually need to flush it back with a 9p Twrite; Let's
> >>>>>> assume that before this happens, the network goes down.
> >>>>>>
> >>>>>> How do you communicate the error with userspace?
> >>>>>
> >>>>> This was just a brainwave but...
> >>>>>
> >>>>> You have a (control) connection with the mmap device to
> >>>>> set up mmap so might as well use it to convey errors!
> >>>>> This device would be strictly local to where a program
> >>>>> runs.
> >>>>>
> >>>>> I'd even consider allowing a separate process to mmap,
> >>>>> by making an address space a first class object. That'd
> >>>>> move more stuff out of the kernel and allow for more
> >>>>> interesting/esoteric uses.
> >>>> *9fans <https://9fans.topicbox.com/latest>* / 9fans / see discussions
> >>> <https://9fans.topicbox.com/groups/9fans> + participants
> >>> <https://9fans.topicbox.com/groups/9fans/members> + delivery options
> >>> <https://9fans.topicbox.com/groups/9fans/subscription> Permalink
> >>> <
> https://9fans.topicbox.com/groups/9fans/Te8d7c6e48b5c075b-Mf3cfeeb18fd00292d3f9063f

------------------------------------------
9fans: 9fans
Permalink: https://9fans.topicbox.com/groups/9fans/Te8d7c6e48b5c075b-Mfe8a1e4feddaad7bebb650ea
Delivery options: https://9fans.topicbox.com/groups/9fans/subscription
