Charles R Harris wrote:
> On Sat, Oct 23, 2010 at 10:15 AM, Charles R Harris
> <[email protected]> wrote:
>
>> On Sat, Oct 23, 2010 at 9:44 AM, braingateway <[email protected]> wrote:
>>
>>> David Cournapeau wrote:
>>>
>>>> 2010/10/23 braingateway <[email protected]>:
>>>>
>>>>> Hi everyone,
>>>>> I noticed that numpy.memmap uses RAM to buffer data from memmap files.
>>>>> If I have a 100 GB array in a memmap file and process it block by block,
>>>>> the RAM usage keeps increasing while the process runs until there is no
>>>>> available space left in RAM (4 GB), even though the block size is only 1 MB.
>>>>> For example:
>>>>> ####
>>>>> import numpy as npy
>>>>> a = npy.memmap('a.bin', dtype='float64', mode='r')
>>>>> blocklen = int(1e5)
>>>>> b = npy.zeros(len(a) // blocklen)
>>>>> for i in range(len(a) // blocklen):
>>>>>     b[i] = npy.mean(a[i * blocklen:(i + 1) * blocklen])
>>>>> ####
>>>>> Is there any way to restrict the memory usage in numpy.memmap?
>>>>
>>>> The whole point of using memmap is to let the OS do the buffering for
>>>> you (which is likely to do a better job than you in many cases). Which
>>>> OS are you using? And how do you measure how much memory is taken by
>>>> numpy for your array?
>>>>
>>>> David
>>>
>>> Hi David,
>>>
>>> I agree with you about the point of using memmap. That is why the
>>> behavior is so strange to me. I actually measured the size of the
>>> resident set (pink trace in figure 2) of the python process on Windows.
>>> Here I attached the result. You can see the RAM usage is definitely not
>>> file system cache.
>>
>> Umm, a good operating system will use *all* of RAM for buffering because
>> RAM is fast and it assumes you are likely to reuse data you have already
>> used once. If it needs some memory for something else it just writes a
>> page to disk, if dirty, and reads in the new data from disk and changes
>> the address of the page. Where you get into trouble is if pages can't be
>> evicted for some reason. Most modern OS's also have special options
>> available for reading in streaming data from disk that can lead to
>> significantly faster access for that sort of thing, but I don't think you
>> can do that with memmapped files.
>>
>> I'm not sure how Windows labels its memory. IIRC, memmapping a file leads
>> to what is called file-backed memory; it is essentially virtual memory.
>> Now, I won't bet my life that there isn't a problem, but I think a
>> misunderstanding of the memory information is more likely.
>
> It is also possible that something else in your program is hanging onto
> memory, but without knowing a lot more it is hard to tell. Are you seeing
> symptoms besides the memory graphs? It looks like you aren't running on
> Windows, actually, so what OS are you running on?
>
> Chuck

Hi Chuck,
Thanks a lot for the quick response. I do run the following super simple
script on Windows:
####
import numpy as npy
a = npy.memmap('a.bin', dtype='float64', mode='r')
blocklen = int(1e5)
b = npy.zeros(len(a) // blocklen)
for i in range(len(a) // blocklen):
    b[i] = npy.mean(a[i * blocklen:(i + 1) * blocklen])
####
Everything became super slow after python ate all the RAM. By the way, I
also tried Qt QFile::map(); there is no problem at all...

LittleBigBrain

_______________________________________________
NumPy-Discussion mailing list
[email protected]
http://mail.scipy.org/mailman/listinfo/numpy-discussion
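[Editor's sketch, not part of the thread: if the concern is the size of the
mapped window rather than the OS page cache, one workaround is to map each
block separately using numpy.memmap's offset and shape parameters, so only
the current block is attached to the process at any time. The helper name
blockwise_means and the default block length below are illustrative
assumptions; how aggressively the kernel releases the cached pages is still
up to the OS.]

####
import os
import numpy as np

def blockwise_means(path, blocklen=100000, dtype='float64'):
    """Mean of each consecutive block, mapping only one block at a time."""
    itemsize = np.dtype(dtype).itemsize
    nitems = os.path.getsize(path) // itemsize
    nblocks = nitems // blocklen
    means = np.zeros(nblocks)
    for i in range(nblocks):
        # Map just the current block of the file.
        block = np.memmap(path, dtype=dtype, mode='r',
                          offset=i * blocklen * itemsize,
                          shape=(blocklen,))
        means[i] = block.mean()
        # Drop the reference so the mapping can be unmapped before the
        # next block is opened.
        del block
    return means

# Usage on the file from the thread (name assumed):
# means = blockwise_means('a.bin')
####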
