On 02/27/2015 06:54 AM, Steven D'Aprano wrote:
> Dave Angel wrote:
>
>> On 02/27/2015 12:58 AM, Steven D'Aprano wrote:
>>> Dave Angel wrote:
>>>
>>>> (Although I believe Seymour Cray was quoted as saying that virtual
>>>> memory is a crock, because "you can't fake what you ain't got.")
>>>
>>> If I recall correctly, disk access is about 10000 times slower than RAM,
>>> so virtual memory is *at least* that much slower than real memory.
>>
>> It's so much more complicated than that, that I hardly know where to
>> start.
>>
>> [snip technical details]
>
> As interesting as they were, none of those details will make swap faster,
> hence my comment that virtual memory is *at least* 10000 times slower than
> RAM.


The term "virtual memory" is used for many aspects of the modern memory architecture. But I presume you're using it in the sense of "running in a swapfile" as opposed to running in physical RAM.

Yes, a page fault takes on the order of 10,000 times as long as an access to a location in L1 cache, though I suspect the ratio is a lot smaller if the swapfile is on an SSD. But only the first byte is that slow.

But once the fault is resolved, the nearby bytes are in physical memory, and some of them are in L3, L2, and L1 cache. So you're not running in the swapfile any more. And even when you run off the end of the page, fetching the sequentially adjacent page from a hard disk is much faster than a random seek; with well-designed buffering on the disk, faster still. The OS tries pretty hard to keep the swapfile unfragmented.

The trick is to minimize the number of page faults, especially those to random locations. If you're getting lots of them, that's called thrashing.

There are tools to help with that. To minimize page faults on code, linking with a good working-set tuner can help, though I don't hear of people bothering these days. To minimize page faults on data, choosing one's algorithm carefully can help. For example, when scanning through a typical matrix, row order might mean adjacent locations, while column order might mean scattered ones.
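A rough sketch of that row-versus-column difference in Python (the function names and the list-of-lists representation are just made up for illustration):

# Each row of a list-of-lists is stored as a contiguous array of
# references, so walking a row touches adjacent memory, while walking
# a column hops from one row's storage to another's.

def sum_by_rows(matrix):
    total = 0
    for row in matrix:                    # adjacent accesses within each row
        for value in row:
            total += value
    return total

def sum_by_columns(matrix):
    total = 0
    for col in range(len(matrix[0])):     # scattered: one element per row
        for row in range(len(matrix)):
            total += matrix[row][col]
    return total

With a matrix big enough to spill into the swapfile (a numpy array, say), the row-order version keeps working on pages it has already faulted in, while the column-order version keeps touching new ones.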

Not really much different from reading a text file. If you can arrange to process it a line at a time, rather than reading the whole file into memory, you generally minimize your round-trips to disk. And if you need random access to it, it's quite likely more efficient to memory-map it, in which case it temporarily becomes part of the swapfile system.
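Something along these lines, for instance (the function names and the 80-column threshold are hypothetical):

import mmap

# Sequential processing: only a small buffer is resident at any one
# time, so the whole file never has to fit in RAM.
def count_long_lines(path, limit=80):
    count = 0
    with open(path, "rb") as f:
        for line in f:
            if len(line) > limit:
                count += 1
    return count

# Random access: memory-map the file and let the OS page pieces of it
# in and out on demand, much as it does with the swapfile.
def byte_at(path, offset):
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            return mm[offset]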

--
DaveA