Dear Francesc,

Thanks for the advice. I had tried some tutorials on PyTables and was under the impression that it could give higher performance than memory mapped arrays. I think I'm starting to get a better feel now for the respective advantages of memory mapped arrays vs PyTables. In my case it seems that memory mapped arrays are the way to go.
Cheers,
Sam

On 3 March 2011 19:20, Francesc Alted <fal...@pytables.org> wrote:

> On Thursday 03 March 2011 17:32:05 samuel sinayoko wrote:
> > Hi everyone,
> >
> > I'm a postdoc in fluid dynamics and acoustics. I need to compute
> > Fourier transforms of big arrays (~4GB). I've been comparing various
> > options to do so:
> > - option 1: memory mapped arrays (with numpy), using the first index
> >   to represent time frames
> > - option 2: memory mapped arrays (with numpy), using the last index
> >   to represent time frames
> > - option 3: PyTables CArray with the last index to represent time frames
> >
> > I've put together a script to test these options here:
> > http://dpaste.com/469184/
> > (I'll also paste the script below)
> >
> > The results I get are:
> > Option 1: 1.310 sec
> > Option 2: 1.033 sec
> > Option 3: 3.318 sec
> >
> > This is without using compression. I was expecting PyTables to give
> > the best performance.
>
> May I ask why you expected PyTables to give the best performance? IMO,
> memory mapped arrays are a kind of optimal solution for achieving maximum
> I/O performance. You should use PyTables only if you need additional
> advantages (like on-the-fly compression, hierarchical structures,
> querying large tables...).
>
> Having said this, I'd try using Array objects so as to avoid using
> chunking in HDF5. That should offer higher I/O speed.
>
> Finally, I'd say that the only chance for CArray objects to beat memory
> mapped files is by using compression (especially Blosc), but of course,
> this will only work if your datasets are significantly compressible.
>
> Cheers,
>
> --
> Francesc Alted
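
For readers following the thread, options 1 and 2 can be sketched roughly as below. This is not the script from the dpaste link and it does not attempt to reproduce the timings quoted above. The array sizes, file names and variable names are all assumptions made for illustration (the real arrays are ~4GB), and only NumPy is used, so option 3 (the PyTables CArray) is left out.

```python
import os
import shutil
import tempfile

import numpy as np

# Hypothetical, deliberately tiny sizes so the sketch runs quickly.
n_time, n_points = 256, 64

tmpdir = tempfile.mkdtemp()
rng = np.random.default_rng(0)
data = rng.standard_normal((n_time, n_points))

# Option 1: time frames along the first index.
time_first = np.memmap(os.path.join(tmpdir, "time_first.dat"),
                       dtype="float64", mode="w+",
                       shape=(n_time, n_points))
time_first[:] = data
time_first.flush()

# Option 2: time frames along the last index, so each time series
# is contiguous on disk for a C-ordered array.
time_last = np.memmap(os.path.join(tmpdir, "time_last.dat"),
                      dtype="float64", mode="w+",
                      shape=(n_points, n_time))
time_last[:] = data.T
time_last.flush()

# Fourier transform along the time axis in each layout.
spec_first = np.fft.rfft(time_first, axis=0)
spec_last = np.fft.rfft(time_last, axis=-1)

# Both layouts hold the same data, so the spectra should agree
# up to a transpose.
same_spectra = np.allclose(spec_first.T, spec_last)

# Release the mappings before removing the temporary files.
del time_first, time_last
shutil.rmtree(tmpdir, ignore_errors=True)
```

With the last index as time, every transform reads one contiguous run of samples, which may be part of why option 2 came out slightly ahead of option 1 in the figures quoted above, though that is a guess rather than something measured here.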
_______________________________________________
Pytables-users mailing list
Pytables-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/pytables-users