(Re-raising an issue that was brought up last year: [1])

Since the Enthought webinar on memmap-ing numpy arrays [2] suggested PyTables for creating new files (see slide 30 at [3]), I assumed by association that PyTables also memory-mapped its data. I switched an algorithm that kept its data in memory over to PyTables, and sure enough memory usage dropped dramatically. Coming back to it now, though, I find that performance took a big hit. On closer investigation: no, PyTables doesn't mmap. Oops.

(Use case: we have a read-only matrix that is an array of vectors. Given a probe vector, we want to find the top n vectors closest to it, measured by dot product. numpy's dot function does exactly what we want. But this runs in a multiprocess server, and these matrices are largeish, so I thought memmap would be a good way to let the OS handle sharing the matrix between the processes.)

(Array _columns_ are stored contiguously, right?)

Since PyTables doesn't currently do what I thought it did, we'll probably move to using memmapped ndarrays directly, as the webinar describes. But the natural question is: could PyTables possibly do what I thought it could? Compressed data might be very hard to handle, but uncompressed data seems possible: if the data is contiguous in the HDF5 file, all we really need is a way to get that data in memory, or at least its offset into the file. Poking around the HDF5 API [4], I don't see an obvious way to do that, but I do wonder if anyone has given it any thought.

Thanks,
-Ken

[1] http://sourceforge.net/mailarchive/message.php?msg_id=200809271036.50004.faltet%40pytables.com
[2] http://www.enthought.com/training/SCPwebinar.php#w2009-05-22
[3] http://www.slideshare.net/enthought/python-for-scientific-computing-webinar-may-22-2009
[4] http://www.hdfgroup.org/HDF5/doc/RM/RM_H5Front.html
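
P.S. For concreteness, here is a minimal sketch of the memmapped-ndarray approach for the use case above. It is an illustration under assumptions, not what our server actually runs: the file name, the matrix sizes, and the helper `top_n_by_dot` are all made up, and it uses `np.save` plus `np.load(..., mmap_mode="r")` as one convenient way to get a read-only memory map. Note that NumPy's default (C) order stores each row contiguously, so the sketch keeps one vector per row.

```python
import os
import tempfile

import numpy as np


def top_n_by_dot(matrix, probe, n):
    """Return the indices of the n rows with the largest dot product, best first.

    Hypothetical helper for illustration; assumes n >= 1.
    """
    scores = np.dot(matrix, probe)             # one score per row vector
    n = min(n, scores.shape[0])
    top = np.argpartition(scores, -n)[-n:]     # the top n, in no particular order
    return top[np.argsort(scores[top])[::-1]]  # reorder so the best comes first


# Write an example matrix to disk once (made-up name and sizes).
tmpdir = tempfile.mkdtemp()
path = os.path.join(tmpdir, "vectors.npy")
rng = np.random.default_rng(0)
np.save(path, rng.standard_normal((1000, 64)))

# Each server process would open the same file read-only. The data is paged in
# on demand, and the OS page cache shares those pages between processes.
matrix = np.load(path, mmap_mode="r")

probe = rng.standard_normal(64)
best = top_n_by_dot(matrix, probe, 5)
print(best)
```

Using `argpartition` avoids sorting every score when only the top few are wanted; a plain `argsort` over all scores would give the same answer for small matrices.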