Re: randomIO, std.file, core.stdc.stdio

Steven Schveighoffer via Digitalmars-d-learn Tue, 26 Jul 2016 13:06:12 -0700

On 7/26/16 3:30 PM, Charles Hixson via Digitalmars-d-learn wrote:

> On 07/26/2016 11:31 AM, Steven Schveighoffer via Digitalmars-d-learn wrote:
>
>> Now, C i/o's buffering may not suit your exact needs. So I don't know
>> how it will perform. You may want to consider mmap which tells the
>> kernel to link pages of memory directly to disk access. Then the
>> kernel is doing all the buffering for you. Phobos has support for it,
>> but it's pretty minimal from what I can see:
>> http://dlang.org/phobos/std_mmfile.html

> I've considered mmapfile often, but when I read the documentation I end
> up realizing that I don't understand it.  So I look up memory mapped
> files in other places, and I still don't understand it.  It looks as if
> the entire file is stored in memory, which is not at all what I want,
> but I also can't really believe that's what's going on.


Of course that isn't what is happening :)

What happens is that the kernel says memory page 0x12345 (or whatever)
is mapped to the file. Then when you access a mapped page, the system
memory management unit raises a page fault (because that memory isn't
loaded), which triggers the kernel to load that page of memory. The
kernel sees that the memory is really mapped to that file, and loads the
page from the file instead. As you write to the memory location, the
page is marked dirty, and at some point, the kernel flushes that page
back to disk.

Everything is done behind the scenes and is in tune with the filesystem
itself, so you get a little extra benefit from that.
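
The sequence described above (map, touch a page, write, flush) can be
sketched in a few lines. This is only an illustration using Python's
`mmap` module, which wraps the same OS facility; it is not D's
std.mmfile API. The file size, the offset, and the function name
`demo_mapped_write` are all made up for the example.

```python
import mmap
import os
import tempfile

PAGE = mmap.PAGESIZE
FILE_SIZE = 64 * PAGE      # assumed size, chosen only for the demo
OFFSET = 40 * PAGE         # a byte somewhere in the middle of the file


def demo_mapped_write():
    """Map a file, read and write one spot through memory, flush to disk."""
    fd, path = tempfile.mkstemp()
    try:
        # Give the file a size so there is something to map.
        os.ftruncate(fd, FILE_SIZE)

        with mmap.mmap(fd, FILE_SIZE) as m:
            # Indexing the map is a plain memory access; the OS is
            # expected to bring in the backing page on demand.
            first = m[OFFSET]

            # Writing through the map modifies the page in memory.
            m[OFFSET:OFFSET + 5] = b"hello"

            # Ask for modified pages to be written back now instead of
            # waiting for the kernel to do it on its own schedule.
            m.flush()

        # Read the same spot back with ordinary file I/O.
        os.lseek(fd, OFFSET, os.SEEK_SET)
        on_disk = os.read(fd, 5)
        return first, on_disk
    finally:
        os.close(fd)
        os.remove(path)


if __name__ == "__main__":
    print(demo_mapped_write())
```

Note that the program never issues a read or write call for the mapped
region itself; the only explicit file I/O is the check at the end.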

> I know that
> there was an early form of this in a version of BASIC (the version that
> RISS was written in, but I don't remember which version that was) and in
> *that* version array elements were read in as needed.  (It wasn't
> spectacularly efficient.)  But memory mapped files don't seem to work
> that way, because people keep talking about how efficient they are.  Do
> you know a good introductory tutorial?  I'm guessing that "window size"
> might refer to the number of bytes available, but what if you need to
> append to the file?  Etc.

To be honest, I'm not super familiar with actually using them, I just
have a rough idea of how they work. The actual usage you will have to
look up.
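
As a starting point for experimenting with the "window" and "append"
questions, here is a rough sketch, again in Python rather than D. It
assumes a "window" means mapping only a slice of the file at some
offset; whether the window parameter in std.mmfile behaves exactly this
way is not something this thread confirms. The function name
`demo_window_and_append` and the file layout are hypothetical.

```python
import mmap
import os
import tempfile

# Offsets passed to mmap must be multiples of this value.
GRAN = mmap.ALLOCATIONGRANULARITY


def demo_window_and_append():
    """Map one slice of a file, then grow the file and map the new tail."""
    fd, path = tempfile.mkstemp()
    try:
        # Build a file of 4 chunks; chunk i is filled with byte value i.
        for i in range(4):
            os.write(fd, bytes([i]) * GRAN)

        # Map only the third chunk. Index 0 of this mapping corresponds
        # to file offset 2 * GRAN; the rest of the file is not mapped.
        with mmap.mmap(fd, GRAN, offset=2 * GRAN) as window:
            seen = window[0]

        # "Appending": grow the file first so the range to be mapped
        # lies inside the file, then map the newly added region.
        old_size = os.fstat(fd).st_size
        os.ftruncate(fd, old_size + GRAN)
        with mmap.mmap(fd, GRAN, offset=old_size) as tail:
            tail[:4] = b"tail"
            tail.flush()

        # Confirm with ordinary file I/O that the data landed at the end.
        os.lseek(fd, old_size, os.SEEK_SET)
        appended = os.read(fd, 4)
        new_size = os.fstat(fd).st_size
        return seen, appended, new_size == old_size + GRAN
    finally:
        os.close(fd)
        os.remove(path)


if __name__ == "__main__":
    print(demo_window_and_append())
```

Growing the file and then mapping the new tail is one way to append;
it is a design choice for the sketch, not the only option.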

> A part of the problem is that I don't want this to be a process with an
> arbitrarily high memory use.

You should know that you can allocate as much memory as you want, as
long as you have address space for it, and you won't actually map that
to physical memory until you use it. So the management of the memory is
done lazily, all supported by the MMU hardware. This is true for actual
memory too!
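
A small sketch of that idea, using an anonymous (not file-backed)
mapping in Python. The 256 MiB size and the name `demo_lazy_region` are
arbitrary choices for the example. The sketch reserves the region and
touches only a few bytes of it; it does not measure resident memory, so
the laziness itself is the operating system's usual behaviour rather
than something the code proves.

```python
import mmap

REGION = 256 * 1024 * 1024   # 256 MiB of address space, assumed size


def demo_lazy_region():
    """Reserve a large anonymous region and touch only a few bytes."""
    # File descriptor -1 requests an anonymous mapping: memory that is
    # not backed by any named file.
    with mmap.mmap(-1, REGION) as region:
        # A byte that was never written reads back as zero.
        untouched = region[REGION // 2]

        # Write one byte at each end of the region. Only the pages
        # containing these bytes have been written to.
        region[0] = 0x2A
        region[REGION - 1] = 0x2B

        return untouched, region[0], region[REGION - 1], len(region)


if __name__ == "__main__":
    print(demo_lazy_region())
```

To see the effect on real memory use, one could watch the process in a
system monitor while raising REGION or touching more pages.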

Note that the only "memory" you are using for the mmapped file are page
buffers in the kernel which are likely already being used to buffer the
disk reads. It's not like it's loading the entire file into memory, and
probably doesn't even load all sequential pages into memory. It only
loads the ones you use.

I'm pretty much at my limit for knowledge of this subject (and maybe I
have a few things incorrect), I'm sure others here know much more. I
suggest you play a bit with it to see what the performance is like. I
have also heard that it's very fast.


-Steve
