A short reply like this may be inadequate to explain virtual memory mechanisms 
if you have never heard of them before. That said, if you have heard of them in 
the past and forgotten, this may help.

`newMemMapFileStream` calls `memfiles.open` with default flags. With default 
flags, the open typically just looks up the size of the file and creates an 
address range in your process. No file data is read at that point. When your 
program first touches an address in that range, the virtual memory hardware 
raises a page fault, and the OS kernel's "page fault handler" (transparently to 
your process) populates the corresponding "page" of memory, 4 KiB or possibly 
larger, with the file contents for that spot. This on-demand loading is all 
fairly portable behavior.

So, in light of that, at the beginning of your loop, nothing will be "loaded". 
By the end of the loop, at most as much will be loaded as can fit in the RAM of 
your machine. What is actually "loaded" at any moment depends upon the sometimes 
highly dynamic competition for physical memory among all the programs on a 
system. This is generally also true of any buffered IO mechanism when "swap 
files", "page files", or swap partitions are enabled. Each page must be resident 
for at least a little while or your program cannot make progress. By the time 
you get to the end of your loop, though, if the file is larger than RAM or if 
some other process is demanding a lot of RAM, then the beginning may no longer 
be "resident" in RAM.
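
As a minimal sketch of the kind of loop being discussed (the proc name 
`countLinesStream` is hypothetical, and I am assuming your loop reads lines), 
opening the stream only sets up the mapping, and pages get faulted in as 
`readLine` walks forward through the file:

```nim
import std/[memfiles, streams]

# Hypothetical helper: counts lines by streaming over a memory-mapped file.
proc countLinesStream(path: string): int =
  # Only creates the mapping; no file data has been read yet.
  let s = newMemMapFileStream(path)   # mode defaults to fmRead
  defer: s.close()
  var line = ""
  # Each readLine touches mapped addresses; the first touch of a page
  # triggers a page fault and the kernel fills that page from the file.
  while s.readLine(line):
    inc result
```

Nothing in the loop asks for pages explicitly. Residency is entirely up to the 
kernel, which is why the start of a very large file may already be evicted by 
the time the loop finishes.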

Certain operating systems allow you to "tune" the on-demand loading behavior 
with "flag" arguments to the API that sets up this "auto-loading" mechanism. 
For example, Linux allows you to specify `MAP_POPULATE` which will, in effect, 
pre-load the whole file into RAM **before** your program loop runs, without your 
program making the CPU dereference any of those file data addresses. You may 
want to do this if, for example, the persistent backing store is a magnetic 
spinning disk, the file is small, and you want to avoid "seeking" the disk head 
around. Similarly, on Unix, there are the `madvise`/`posix_madvise` interfaces, 
which let a program advise the OS that memory accesses are likely to be 
sequential (your case) or random, or even mark certain ranges as candidates for 
preloading. These little tweaks tend to be very non-portable, though, and the 
default behavior probably does what you want.
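
A hedged sketch of both tweaks follows. It assumes a Nim version whose 
`memfiles.open` accepts a `mapFlags` parameter and whose `posix` module exposes 
`MAP_POPULATE` (hence the `when declared` guard, which falls back to the default 
mapping otherwise). The proc name `openTuned` and its `prefault` parameter are 
made up for illustration:

```nim
import std/memfiles
when defined(posix):
  import std/posix

# Hypothetical helper: maps `path` read-only with optional OS-specific hints.
proc openTuned(path: string; prefault = false): MemFile =
  when declared(MAP_POPULATE):
    if prefault:
      # Linux: ask the kernel to populate the whole mapping up front.
      return memfiles.open(path, mapFlags = MAP_SHARED or MAP_POPULATE)
  # Portable default: pages are loaded on demand.
  result = memfiles.open(path)
  when defined(posix):
    # Advisory only; the OS is free to ignore it, so the result is discarded.
    discard posix_madvise(result.mem, result.size, POSIX_MADV_SEQUENTIAL)
```

The caller still owns the returned `MemFile` and should `close` it when done. 
Whether either hint helps depends heavily on the OS, the storage device, and the 
file size, so it is worth measuring before keeping it.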

If it does not do what you want, `MemMapFileStream` does not (presently) 
support adding "flags" to the OS mapping calls. I did recently improve the 
`memfiles.open` interface to allow just that. You might like the non-stream API 
better anyway. You can `cast[ptr UncheckedArray[char]]` the `MemFile.mem` 
pointer and just use the file as an array of bytes if you like. You do have to 
be careful not to overrun the end of the file, which is `MemFile.size` bytes 
long. Another recent addition I got in allows 
`toOpenArray(cast[ptr UncheckedArray[char]](ThePointer), 0, TheFileSize-1)` 
style passing of such memory to Nim procs expecting `openArray[char]` 
parameters.
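
For instance, the non-stream route might look roughly like this. The procs 
`countNewlines` and `countLinesMapped` are hypothetical names for illustration, 
and the sketch assumes a non-empty file and a Nim version that supports 
`toOpenArray` on `ptr UncheckedArray`:

```nim
import std/memfiles

# Hypothetical helper: works on any char buffer, mapped or not.
proc countNewlines(data: openArray[char]): int =
  for c in data:
    if c == '\n':
      inc result

# Hypothetical wrapper: maps `path` read-only and scans it in place.
proc countLinesMapped(path: string): int =
  var mf = memfiles.open(path)        # mode defaults to fmRead
  defer: mf.close()
  # View the mapping as an array of bytes; no copy is made.
  let bytes = cast[ptr UncheckedArray[char]](mf.mem)
  # The slice bounds 0 .. size-1 keep the callee inside the mapped file.
  result = countNewlines(toOpenArray(bytes, 0, mf.size - 1))
```

Because `countNewlines` takes an `openArray[char]`, the same proc also accepts an 
ordinary `string` or `seq[char]`, which makes it easy to test without a file.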
