On Dec 21, 2007 6:45 AM, David Cournapeau <[EMAIL PROTECTED]> wrote:

> Hans Meine wrote:
> > On Friday, 21 December 2007 13:23:49, David Cournapeau wrote:
> >
> >>> Instead of saying "memmap is ALL about disk access" I would rather
> >>> like to say that "memmap is all about SMART disk access" -- what I
> >>> mean is that memmap should run as fast as a normal ndarray if it
> >>> works on the cached part of an array. Maybe there is a way of
> >>> telling memmap when and what to cache and when to sync that cache
> >>> to the disk. In other words, memmap should perform just like an
> >>> in-physical-memory array -- only that it once in a while saves/loads
> >>> to/from the disk. Or is this just wishful thinking? Is there a way
> >>> of "pre-loading" a given part into the cache (physical memory) or
> >>> of preventing disk writes at "bad times"? How about doing the sync
> >>> from a different thread ;-)
> >>>
> >> mmap uses the OS IO caches; that's kind of the point of using mmap
> >> (at least in this case). Instead of doing the caching yourself, the
> >> OS does it for you, and OSes are supposed to be smart about this :)
> >>
> > AFAICS this is what Sebastian wanted to say, but as the OP indicated,
> > preloading, e.g. by reading the whole array once, did not work for
> > him. Thus, I understand Sebastian's questions as "is it possible to
> > help the OS when it is not smart enough?" -- maybe something along
> > the lines of mlock, only not quite as aggressive.
> >
> I don't know exactly why it did not work, but it is not difficult to
> imagine why it could fail: when you read a 2 GB file, it may not be
> smart on average to put the whole file in the buffer cache, since
> everything else gets kicked out. It all depends on the situation, but
> many different things can influence this behaviour: the IO scheduler,
> how smart the VM is, the filesystem (on Linux, some filesystems are
> better than others for RT audio DSP, and some options are better left
> out), etc. On Linux, using the deadline IO scheduler can help, for
> example (that's the recommended scheduler for IO-intensive musical
> applications).
> <snip>
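For what it's worth, one concrete way of "helping the OS" from Python is to combine np.memmap with an madvise hint. A minimal sketch (the file name and sizes are made up; mmap.madvise needs Python 3.8+ and a POSIX system, so it is guarded here):

```python
import mmap
import os
import tempfile

import numpy as np

# Stand-in for the big data file under discussion.
path = os.path.join(tempfile.mkdtemp(), "data.dat")
np.arange(1_000_000, dtype=np.float64).tofile(path)

# Memory-map the file; reads go through the OS page cache, so access to
# already-cached parts runs at in-memory speed.
a = np.memmap(path, dtype=np.float64, mode="r")

# Advise the kernel that these pages will be needed soon, so it can start
# faulting them in before we actually touch the data.
fd = os.open(path, os.O_RDONLY)
mm = mmap.mmap(fd, 0, access=mmap.ACCESS_READ)
if hasattr(mm, "madvise") and hasattr(mmap, "MADV_WILLNEED"):
    mm.madvise(mmap.MADV_WILLNEED)  # a hint, not a guarantee

print(float(a[:10].sum()))  # prints 45.0; pages fault in on first touch
mm.close()
os.close(fd)
```

As David says, this is only a hint: the kernel is still free to evict those pages if something else needs the memory.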
> But if what you want is to reliably read a big file "in real time"
> which cannot fit in memory, then you need a design where something
> does the disk buffering the way you want. Again, taking the example I
> am somewhat familiar with: in audio processing, you often have an IO
> thread which does the pre-caching and puts the data into mlock'ed
> buffers for another thread, the one which is RT.

IIRC, Martin really wanted something like streaming IO broken up into
smaller frames, with previously cached results ideally discarded.

Chuck
_______________________________________________
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion