On Wed, Feb 11, 2026 at 10:36 AM hiro <[email protected]> wrote:
> i'm not sure i understand even the most abstract topic discussed here, what's
> the advantage of logically organizing your data in size constrained unnamed
> page tables as opposed to files in a named tree?
You don't. You're still using named files; this affects how you access
them: does that look like loads and stores from and to memory, or does
it look like explicit system calls a la `read` and `write`?
> in the end you are assuming your memloads are not gonna be translated into
> some message passing protocol underneath? are you sure you can treat all your
> big data like one continuous memory region?
Depends on the data. For the sorts of things we're talking about with
`mmap`, you open a file, and then you map some region within the file
to some part of your address space; you don't have to map the whole
thing, just the part you're interested in. If it doesn't fit, you get
an error.
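
To make the contrast concrete, here is a minimal sketch in POSIX C (not Plan 9 C, which has no `mmap`). It assumes a Unix-like system; the function names `make_demo_file`, `byte_via_pread`, and `byte_via_mmap` are made up for illustration. Both accessors fetch the same byte from a named file: one with an explicit system call, the other by mapping only the page that contains the offset and doing an ordinary load.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: create a file whose byte at offset i is i % 251. */
int make_demo_file(const char *path, size_t nbytes)
{
    FILE *f = fopen(path, "wb");
    if (f == NULL)
        return -1;
    for (size_t i = 0; i < nbytes; i++) {
        if (fputc((int)(i % 251), f) == EOF) {
            fclose(f);
            return -1;
        }
    }
    return fclose(f) == 0 ? 0 : -1;
}

/* Explicit system call: ask the kernel to copy one byte into *out. */
int byte_via_pread(const char *path, off_t off, unsigned char *out)
{
    assert(out != NULL);
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    ssize_t n = pread(fd, out, 1, off);
    close(fd);
    return n == 1 ? 0 : -1;
}

/* Memory mapping: map only the page containing `off`, then do a plain load. */
int byte_via_mmap(const char *path, off_t off, unsigned char *out)
{
    assert(out != NULL);
    long pagesz = sysconf(_SC_PAGESIZE);
    if (pagesz <= 0)
        return -1;
    off_t base = off - (off % pagesz);  /* mmap file offsets must be page-aligned */
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    unsigned char *p = mmap(NULL, (size_t)pagesz, PROT_READ, MAP_PRIVATE, fd, base);
    close(fd);                          /* the mapping stays valid after close */
    if (p == MAP_FAILED)
        return -1;                      /* e.g. the region does not fit */
    *out = p[off - base];               /* an ordinary load; may page-fault */
    munmap(p, (size_t)pagesz);
    return 0;
}
```

Calling `byte_via_pread(path, off, &a)` and `byte_via_mmap(path, off, &b)` on the same file and offset should leave `a == b`, as long as `off` is inside the file; the file is named and opened the same way in both cases, and only the access path differs.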
> "The main benefits of mmap are reduced initial latency" -> latency from where
> to where? what kind information are you transmitting in that procedure?
Latency from time of `exec` to running the new program, usually.
There are kind of two ways to do it; when you load a binary, you could
read all of the bits of it out of the binary executable image and
eagerly copy them into memory, but that might take a while if the
executable is big. But once you're done, you start it running, and if
it ever faults that's probably an error, so you just kill it.
Or, you can read the relatively small headers at the front of the
binary, and use that information to reserve regions of address space
and their properties, and figure out where the program is supposed to
start executing. Then you just start trying to run, without reading
anything else. But, that's going to page fault pretty much
immediately, like on the first instruction. So you can arrange for the
fault handler to detect that the fault was in a mapped, but
unpopulated, region of memory, read the relevant page of memory from
the underlying executable file, patch that into the address space, and
then return to userspace and restart the faulting instruction, which
probably won't fault now (though one of its operands might refer to
still-unmapped memory). But the program will probably do something that will fault
again soon, at which point you just repeat the same process. You keep
doing that until the program exits or you detect a fault for something
outside of the program's mapped regions, or one that is inside one of
those regions but violates the region's permissions (a store to
a read-only segment or something similar). This is "demand paging" in
a nutshell.
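
The loop described above can be sketched as a toy user-space simulation in C. This is not how any real kernel is written: the 16-byte pages, the hard-coded two-region layout (a real loader would take it from the binary's headers), and all names (`toy_proc`, `toy_exec`, `handle_fault`, `toy_access`) are assumptions made up for illustration. The `image` array stands in for the executable file.

```c
#include <stddef.h>
#include <string.h>

enum { PAGE_SIZE = 16, NPAGES = 4 };        /* toy sizes, not real ones */
enum { PERM_R = 1, PERM_W = 2 };
enum fault_result { FAULT_RESOLVED, FAULT_KILL };

struct region {                             /* reserved address range + permissions */
    size_t start, end;                      /* [start, end) in bytes */
    int perms;
};

struct toy_proc {
    const unsigned char *image;             /* stands in for the executable file */
    struct region regions[2];
    size_t nregions;
    unsigned char mem[NPAGES * PAGE_SIZE];  /* the process's private memory */
    int present[NPAGES];                    /* has this page been populated? */
    int nfaults;                            /* resolved faults, for inspection */
};

/* "exec": reserve regions only; no page contents are read yet. */
void toy_exec(struct toy_proc *p, const unsigned char *image)
{
    memset(p, 0, sizeof *p);
    p->image = image;
    p->regions[0] = (struct region){ 0, 2 * PAGE_SIZE, PERM_R };  /* "text" */
    p->regions[1] = (struct region){ 2 * PAGE_SIZE, NPAGES * PAGE_SIZE,
                                     PERM_R | PERM_W };           /* "data" */
    p->nregions = 2;
}

static const struct region *find_region(const struct toy_proc *p, size_t addr)
{
    for (size_t i = 0; i < p->nregions; i++)
        if (addr >= p->regions[i].start && addr < p->regions[i].end)
            return &p->regions[i];
    return NULL;
}

/* Fault handler: populate the page from the image, or report a fatal fault. */
enum fault_result handle_fault(struct toy_proc *p, size_t addr, int access)
{
    const struct region *r = find_region(p, addr);
    if (r == NULL || (r->perms & access) == 0)
        return FAULT_KILL;                  /* unmapped, or permission violation */
    size_t page = addr / PAGE_SIZE;
    memcpy(&p->mem[page * PAGE_SIZE], &p->image[page * PAGE_SIZE], PAGE_SIZE);
    p->present[page] = 1;
    p->nfaults++;
    return FAULT_RESOLVED;
}

/* One load (PERM_R) or store (PERM_W). Returns 0, or -1 if the process dies. */
int toy_access(struct toy_proc *p, size_t addr, int access, unsigned char *val)
{
    for (;;) {
        const struct region *r = find_region(p, addr);
        if (r != NULL && (r->perms & access) && p->present[addr / PAGE_SIZE]) {
            if (access == PERM_W)
                p->mem[addr] = *val;
            else
                *val = p->mem[addr];
            return 0;
        }
        if (handle_fault(p, addr, access) == FAULT_KILL)
            return -1;
        /* fault resolved: loop around and restart the faulting access */
    }
}
```

In this sketch the first access to each page takes one fault and later accesses to the same page take none, while a store into the read-only region or an access outside both regions returns -1, which corresponds to killing the process.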
> what concrete problem are you trying to solve?
I can't speak to that, I'm afraid.
- Dan C.
> On Wed, Feb 11, 2026 at 4:34 AM Ori Bernstein <[email protected]> wrote:
>>
>> On Tue, 10 Feb 2026 05:13:47 -0500
>> "Alyssa M via 9fans" <[email protected]> wrote:
>>
>> > On Monday, February 09, 2026, at 3:24 PM, ron minnich wrote:
>> > > as for mmap, there's already a defacto mmap happening for executables.
>> > > They are not read into memory. In fact, the first instruction you run in
>> > > a binary results in a page fault.
>> > I thinking one could bring the same transparent/defacto memory mapping to
>> > read(2) and write(2), so the API need not change at all.
>>
>> That gets... interesting, from an FS semantics point of view.
>> What does this code print? Does it change with buffer sizes?
>>
>> fd = open("x", ORDWR);
>> pwrite(fd, "foo", 4, 0);
>> read(fd, buf, 4);
>> pwrite(fd, "bar", 4, 0);
>> print("%s\n", buf);
>>
------------------------------------------
9fans: 9fans
Permalink:
https://9fans.topicbox.com/groups/9fans/Te8d7c6e48b5c075b-M1e5415ad20881dd5ef3e0d59
Delivery options: https://9fans.topicbox.com/groups/9fans/subscription