On Sunday, 4 June 2017 at 06:54:46 UTC, Patrick Schluter wrote:
On Sunday, 4 June 2017 at 06:15:24 UTC, H. S. Teoh wrote:
On Sun, Jun 04, 2017 at 05:41:10AM +0000, Jesse Phillips via
(Note that this is much less of a limitation than it seems;
for example you could use std.mmfile to memory-map the file
into your address space so that it doesn't actually have to
fit into memory, and you can still take slices of it. The OS
will manage the paging from/to disk for you. Of course, it
will be slower when something has to be paged from disk, but
IME this is often much faster than if you read the data into
memory yourself.)
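The memory-map-and-slice idea above can be sketched in Python (a hedged illustration, not from the thread; the file name and contents are made up). Python's mmap module wraps the same OS facility that std.mmfile uses, and slicing the mapping is analogous to slicing an MmFile in D:

```python
import mmap
import os
import tempfile

# Create a small demo file so the sketch is self-contained
# (name and contents are illustrative only).
fd, path = tempfile.mkstemp()
os.write(fd, b"hello, mmap world")
os.close(fd)

with open(path, "rb") as f:
    # Map the whole file read-only; the kernel pages it in on
    # demand, so a large file need not fit in memory at once.
    with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
        # "Slicing" the mapping works like slicing bytes,
        # analogous to slicing an std.mmfile mapping in D.
        first = m[:5]
        print(first)

os.remove(path)
```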
If the file is already in the kernel's page cache, memory mapping
does not need to reload it from disk, since the data is already in
memory. In fact, calling mmap() generally just maps the cached
pages into the process's address space. That sharing of pages is
where most of the performance win from memory mapping comes from.
To be precise, it's the copying of data that mmap spares you. If
the file is in the page cache, the open/read/write/close syscalls
will also be served from the cached pages, but that requires
copying the data from kernel memory into the process's buffer, and
every call to read() has to repeat this copy. So the gain from
mmap comes from avoiding both the memory copy and the
read/write/seek syscalls. Loading the physical file into memory is
the same in both cases.
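The copy-avoidance point can be shown in miniature with a hedged Python sketch (the temp file and sizes are illustrative): os.read() always copies file data from the page cache into a freshly allocated buffer in the process, while a memoryview over an mmap gives access to the mapped pages without that per-call copy:

```python
import mmap
import os
import tempfile

# Self-contained setup: write a 4 KiB demo file.
fd, path = tempfile.mkstemp()
os.write(fd, b"A" * 4096)
os.close(fd)

# 1) read(): the kernel copies data from the page cache into a
#    fresh buffer in this process -- one copy per read() call.
fd = os.open(path, os.O_RDONLY)
buf = os.read(fd, 4096)       # buf is a private copy of the data

# 2) mmap(): the same cached pages are mapped into our address
#    space; a memoryview exposes them without an extra copy.
m = mmap.mmap(fd, 0, access=mmap.ACCESS_READ)
view = memoryview(m)          # no copy made here

assert bytes(view[:8]) == buf[:8]   # same bytes either way

view.release()
m.close()
os.close(fd)
os.remove(path)
```

Either path ends up touching the same cached pages; the difference is only whether an extra user-space copy (and a syscall per access) happens along the way.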
This Stack Overflow discussion [1] links to a RealWorldTech thread
in which Linus Torvalds explains it in detail. On Windows and
Solaris the mechanism is the same.
[1]
https://stackoverflow.com/questions/5902629/mmap-msync-and-linux-process-termination/6219962#6219962