Hi Demolishor,

On Thursday 25 March 2010 16:09:08, Craig the Demolishor wrote:
> Hi folks,
> I've recently written a C++ program which writes out some data in HDF5
> format, conforming to the file description in Appendix F of the PyTables
> docs. When I open this file using PyTables I can do everything I would
> expect to do with a PyTables-generated file, like queries and iterate over
> rows and all that. My problem is that when I create a file with identical
> data from within PyTables, and then I run queries on *that* file, they run
> twice as fast. Both files have no filters on, no compression or anything
> like that. I also let the PyTables-generated file determine my chunksize
> when I create my C++-generated HDF5 file, so when I look at the
> "chunkshape" attribute of both tables, they are the same. Is there another
> parameter or something I am missing that could cause this slowdown?

Well, if you are really cloning the PyTables metainformation, I cannot see
how queries could be slower with your approach. I'd suggest trying the
h5diff utility to compare both files. Perhaps that will shed some light.

> I would like to provide an example but I only see the effect on large files
> (100MB, 100K rows)...I will try and spend some time to see if I can
> reproduce it with a more email-friendly filesize.

Cheers,

-- 
Francesc Alted

_______________________________________________
Pytables-users mailing list
Pytables-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/pytables-users
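
As a complement to the h5diff suggestion above, here is a minimal sketch of comparing the layout metadata of the two tables from within PyTables. It is an illustration, not a diagnosis: the helper names `table_meta` and `diff_meta` are hypothetical, the set of attributes inspected is an assumption about what might differ, and the calls use the current PyTables spelling (`open_file`, `get_node`) rather than the camelCase names of 2010-era releases.

```python
def table_meta(path, where):
    """Collect layout metadata for the table stored at node path `where`."""
    # PyTables is imported lazily so diff_meta() below is usable without it.
    import tables

    with tables.open_file(path, mode="r") as h5:
        table = h5.get_node(where)
        return {
            "nrows": int(table.nrows),
            "chunkshape": tuple(table.chunkshape),
            "byteorder": table.byteorder,
            "dtype": str(table.dtype),      # field names, types and offsets
            "filters": str(table.filters),  # compression and shuffle settings
        }


def diff_meta(a, b):
    """Return {key: (a_value, b_value)} for every key whose values differ."""
    keys = sorted(set(a) | set(b))
    return {k: (a.get(k), b.get(k)) for k in keys if a.get(k) != b.get(k)}
```

With hypothetical file names and node path, `diff_meta(table_meta("cpp.h5", "/data"), table_meta("pytables.h5", "/data"))` returns an empty dict when the inspected attributes match, and otherwise a dict of the differing entries. Comparing `dtype` as a string is a deliberate choice here, since it also exposes differences in field offsets or padding.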