On 02/27/2015 06:54 AM, Steven D'Aprano wrote:
> Dave Angel wrote:
>
>> On 02/27/2015 12:58 AM, Steven D'Aprano wrote:
>>> Dave Angel wrote:
>>>
>>>> (Although I believe Seymour Cray was quoted as saying that virtual
>>>> memory is a crock, because "you can't fake what you ain't got.")
>>>
>>> If I recall correctly, disk access is about 10000 times slower than RAM,
>>> so virtual memory is *at least* that much slower than real memory.
>>
>> It's so much more complicated than that, that I hardly know where to
>> start.
>>
>> [snip technical details]
>
> As interesting as they were, none of those details will make swap faster,
> hence my comment that virtual memory is *at least* 10000 times slower than
> RAM.


The term "virtual memory" is used for many aspects of the modern memory architecture. But I presume you're using it in the sense of "running in a swapfile" as opposed to running in physical RAM.

Yes, a page fault takes on the order of 10,000 times as long as an access to a location in L1 cache, though I suspect the ratio is a lot smaller if the swapfile is on an SSD. But only the first byte is that slow.

But once the fault is resolved, the nearby bytes are in physical memory, and some of them are in L3, L2, and L1 cache. So you're not running in the swapfile any more. And even when you run off the end of the page, fetching the sequentially adjacent page from a hard disk is much faster than a random seek; with well-designed buffering on the disk, faster still. The OS tries pretty hard to keep the swapfile unfragmented.

The trick is to minimize the number of page faults, especially those to random locations. If you're getting lots of them, that's called thrashing.

There are tools to help with that. To minimize page faults on code, linking with a good working-set tuner can help, though I don't hear of people bothering these days. To minimize page faults on data, choosing one's algorithm carefully can help. For example, when scanning through a typical matrix, row order might mean adjacent locations, while column order might mean scattered ones.
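A rough sketch of that row-versus-column difference in Python (the function names and the list-of-lists representation are just made up for illustration):

# Each row of a list-of-lists is stored as a contiguous array of
# references, so walking a row touches adjacent memory, while walking
# a column hops from one row's storage to another's.

def sum_by_rows(matrix):
    total = 0
    for row in matrix:                    # adjacent accesses within each row
        for value in row:
            total += value
    return total

def sum_by_columns(matrix):
    total = 0
    for col in range(len(matrix[0])):     # scattered: one element per row
        for row in range(len(matrix)):
            total += matrix[row][col]
    return total

With a matrix big enough to spill into the swapfile (a numpy array, say), the row-order version keeps working on pages it has already faulted in, while the column-order version keeps touching new ones.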

Not really much different from reading a text file. If you can arrange to process it a line at a time, rather than reading the whole file into memory, you generally minimize your round-trips to disk. And if you need random access to it, it's quite likely more efficient to memory-map it, in which case it temporarily becomes part of the swapfile system.
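Something along these lines, for instance (the function names and the 80-column threshold are hypothetical):

import mmap

# Sequential processing: only a small buffer is resident at any one
# time, so the whole file never has to fit in RAM.
def count_long_lines(path, limit=80):
    count = 0
    with open(path, "rb") as f:
        for line in f:
            if len(line) > limit:
                count += 1
    return count

# Random access: memory-map the file and let the OS page pieces of it
# in and out on demand, much as it does with the swapfile.
def byte_at(path, offset):
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            return mm[offset]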

--
DaveA