A Thursday 11 March 2010 14:35:49 Gael Varoquaux wrote:
> > So, in my experience, numpy.memmap is really using that large chunk of
> > memory (unless my testbed is badly programmed, in which case I'd be
> > grateful if you can point out what's wrong).
>
> OK, so what you are saying is that my assertion #1 was wrong. Fair
> enough; as I was writing it I was thinking that I had no hard fact to
> back it. How about assertion #2? I can only think of this 'story' to
> explain why I can run parallel computations when I use memmap that blow
> up when I don't use memmap.
Well, I must admit that I have no experience running memmapped arrays in
parallel computations, but it sounds like they can actually behave as
shared-memory arrays. So yes, you may definitely be right about #2, i.e.
memmapped data is not duplicated when accessed in parallel by different
processes (in read-only mode, of course), which is certainly a very
interesting technique for sharing data among parallel processes. Thanks
for pointing this out!

> Also, could it be that the memmap mode changes things? I use only the
> 'r' mode, which is read-only.

I don't think so. When doing the computation, I open the x values in
read-only mode, and the memory consumption is still there.

> This is all very interesting, and you have much more insight into these
> problems than me. Would you be interested in coming to EuroSciPy in
> Paris to give a 1- or 2-hour tutorial on memory and I/O problems and
> how you address them with PyTables? It would be absolutely thrilling. I
> must warn you that I'm afraid we won't be able to pay for your trip,
> though, as I want to keep the price of the conference low.

Yes, no problem. I was already thinking about presenting something at
EuroSciPy. A tutorial about PyTables/memory I/O would be really great
for me. We can nail down the details off-list.

-- 
Francesc Alted

_______________________________________________
NumPy-Discussion mailing list
[email protected]
http://mail.scipy.org/mailman/listinfo/numpy-discussion
