On Thursday, February 12, 2026, at 4:53 AM, Dan Cross wrote:
> On Wed, Feb 11, 2026 at 11:08 PM Alyssa M via 9fans <[email protected]> wrote:
>> On Wednesday, February 11, 2026, at 10:01 AM, hiro wrote:
>>> what concrete problem are you trying to solve?
>> Making software simpler to write, I think.
> I don't understand that. If the interface doesn't change, how is it simpler?

Think of a program that reads a file completely into memory, pokes at it a bit sparsely, then writes the whole file out again. This is simple if the file is small. If the file gets big, you might start looking around for ways to not do all that I/O, and pretty soon you have a buffer cache implementation. So the program is now more complex. Not only is there a buffer cache implementation, but you have to use it everywhere, rather than just operating on memory. This is when mmap starts to look appealing.

On Thursday, February 12, 2026, at 4:53 AM, Dan Cross wrote:
> The other [use] is to map the contents of a file into an address space, so that you can treat them like memory, without first reading them from an actual file. This is useful for large but sparse read-only data files: I don't need to read the entire thing into physical memory; for that matter it may not even fit into physical memory. But if I mmap it, and just copy the bits I need, then those can be faulted into the address space on demand.

So what I'm suggesting is that instead of the programmer making an mmap call, they should make a single read call to read the entire file into the address space - as they did before. The new read implementation would do this, but as a memory mapped snapshot. This looks no different to the programmer from how reads have always worked, it just happens very quickly, because no I/O actually happens. The snapshot data is brought in by demand paging as it is touched, and pages may get dirtied.

When the programmer would otherwise call msync, they instead write the entire file back where it came from - as they did before. The write implementation will recognise when it's overwriting the file where the snapshot came from and will only write the dirty pages - which is effectively what msync does. So from the programmer's point of view this is exactly what they've always done.

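To make the comparison concrete, here is a minimal sketch of the two styles a programmer chooses between today, written in Python purely for illustration. The function names `poke_with_read_write`, `poke_with_mmap` and `make_file` are made up for this example; only the standard `mmap`, `os` and `tempfile` modules are used. The first function is the "read it all, poke, write it all" program; the second is the mmap version, where `flush()` plays the role of msync.

```python
import mmap
import os
import tempfile


def poke_with_read_write(path, offsets, value):
    """Conventional style: read the whole file, modify memory, write it all back."""
    with open(path, "rb") as f:
        data = bytearray(f.read())      # the whole file is read
    for off in offsets:
        data[off] = value               # sparse modifications in memory
    with open(path, "wb") as f:
        f.write(data)                   # the whole file is written


def poke_with_mmap(path, offsets, value):
    """mmap style: pages are faulted in on demand and flushed explicitly."""
    with open(path, "r+b") as f:
        # Length 0 maps the whole file; the default mapping is shared,
        # so stores go back to the file.
        with mmap.mmap(f.fileno(), 0) as m:
            for off in offsets:
                m[off] = value          # touches only the pages it needs
            m.flush()                   # analogous to msync


def make_file(size):
    """Hypothetical helper: create a zero-filled temporary file of `size` bytes."""
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as f:
        f.write(bytes(size))
    return path
```

Both functions leave the file in the same state. The suggestion is that the first one could stay exactly as written and still get roughly the I/O behaviour of the second, if read and write were implemented differently underneath.
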
The implementation uses c-o-w snapshots and demand paging which have the performance of mmap, but provide the conventional semantics of read and write. Programs can handle larger files faster without having to change. It's just an optimisation in the read/write implementation.

So that's the idea. Is it practical? I don't know... It's certainly harder to do.

One difference with mmap is that dirty pages don't get back to the file by themselves. You have to do the writes. But I think there may be ways to address this.

On Thursday, February 12, 2026, at 4:53 AM, Dan Cross wrote:
> The problem is, those aren't the right analogues for the file metaphor. `mmap` is closer to `open` than to `read`

In the sense that mmap creates an association between pages and the file and munmap undoes that, yes. With the idea above the page association is with snapshots and is a bit more ephemeral, and I don't know yet how much it matters if it persists after it's no longer needed. Pages are disassociated from snapshots naturally by being dirtied, by being associated with something else or perhaps by memory being deallocated.

It may be somewhat like file deletion. Sometimes when it's 'gone' it's not really gone until the last user lets go. I don't think it's a problem for the process, but it may be for the file system in some situations.
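
As a rough user-space illustration of the c-o-w snapshot behaviour (not a kernel implementation, and not something any existing system provides as read/write), here is a Python sketch. The `Snapshot` class and its `poke` and `write_back` methods are hypothetical names invented for this example. It uses a private copy-on-write mapping (`mmap.ACCESS_COPY`), so stores never reach the file by themselves, and it tracks dirtied pages by hand, where a real implementation would presumably get that from the page tables.

```python
import mmap
import os
import tempfile

PAGE = mmap.PAGESIZE


class Snapshot:
    """Hypothetical stand-in for the proposed read(): a private copy-on-write
    mapping plus a hand-kept record of which pages have been written to."""

    def __init__(self, path):
        self.path = path
        with open(path, "rb") as f:
            # ACCESS_COPY: writes affect memory only, never the underlying file.
            self.mem = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_COPY)
        self.dirty = set()

    def poke(self, offset, value):
        self.mem[offset] = value
        self.dirty.add(offset // PAGE)   # assumption: dirtiness tracked by hand

    def write_back(self):
        """Stand-in for the proposed write(): only dirty pages are written."""
        with open(self.path, "r+b") as f:
            for page in sorted(self.dirty):
                start = page * PAGE
                f.seek(start)
                f.write(self.mem[start:start + PAGE])
        written = len(self.dirty)
        self.dirty.clear()
        return written                   # number of pages actually written

    def close(self):
        self.mem.close()
```

With a four-page file and pokes landing in two of its pages, the file on disk is unchanged until `write_back()` is called, and that call then writes two pages rather than four. That is the difference from a shared mapping mentioned above: you still have to do the write, but the write can be cheap.
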
