On Friday 15 May 2009 15:40:16, David Fokkema wrote:
> Hi list,
>
> I don't get this (using pytables 2.1.1):
>
> In [1]: import tables
>
> In [2]: data = tables.openFile('data_new.h5', 'w')
>
> In [3]: data.createVLArray(data.root, 'nosee', tables.Int32Atom())
> Out[3]:
> /nosee (VLArray(0,)) ''
>   atom = Int32Atom(shape=(), dflt=0)
>   byteorder = 'little'
>   nrows = 0
>   flavor = 'numpy'
>
> In [4]: data.createVLArray(data.root, 'see', tables.Int32Atom(),
>                            filters=tables.Filters(complevel=1))
> Out[4]:
> /see (VLArray(0,), shuffle, zlib(1)) ''
>   atom = Int32Atom(shape=(), dflt=0)
>   byteorder = 'little'
>   nrows = 0
>   flavor = 'numpy'
>
> In [5]: a = 1000000 * [200]
>
> In [6]: for i in range(50):
>    ...:     data.root.see.append(a)
>    ...:
>    ...:
>
> In [7]: data.flush()
>
> And looking at the file:
>
> 191M 2009-05-15 15:37 data_new.h5
>
> Also, writing to the uncompressed array adds another 191 MB to the file.
> So, I really see no compression at all. I also tried zlib(9). Why are my
> arrays not compressed? The repetitive values seem like a perfect
> candidate for compression.
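As a quick sanity check on the numbers quoted above, the following standard-library sketch works out the raw payload size and shows how well one such row compresses with plain zlib, outside of HDF5. It assumes 4 bytes per Int32Atom value and little-endian layout; the constant names are purely illustrative and do not come from PyTables.

```python
import zlib

ROWS = 50                   # number of append() calls in the session above
VALUES_PER_ROW = 1_000_000  # len(a)
ITEMSIZE = 4                # assumed bytes per Int32Atom value

# Raw payload if nothing is compressed at all.
raw_bytes = ROWS * VALUES_PER_ROW * ITEMSIZE
raw_mib = raw_bytes / 2**20
print(f"uncompressed payload: {raw_bytes} bytes = {raw_mib:.1f} MiB")

# One row of the test data as little-endian 32-bit integers, all equal to 200.
row = (200).to_bytes(ITEMSIZE, "little") * VALUES_PER_ROW

# Compress with zlib level 1, the same level requested via Filters(complevel=1).
compressed = zlib.compress(row, 1)
print(f"one row: {len(row)} bytes raw, {len(compressed)} bytes after zlib(1)")
```

The payload comes to about 190.7 MiB, which matches the reported 191M file size, so the file holds essentially the raw data. The same bytes shrink to a tiny fraction under zlib alone, so the data itself is not the obstacle.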
Yes, I can reproduce this. At least PyTables seems to be setting the
filters correctly. For the 'see' dataset, h5ls -v reports:

    Chunks:    {2048} 32768 bytes
    Storage:   800 logical bytes, 391 allocated bytes, 204.60% utilization
    Filter-0:  shuffle-2 OPT {16}
    Filter-1:  deflate-1 OPT {1}
    Type:      variable length of native int

which clearly shows that the filters are correctly installed in the
HDF5 pipeline :-\

This definitely looks like an HDF5 issue. To tell the truth, I've never
seen good compression ratios in VLArrays (although I never thought that
compression could be completely nonexistent!). I'll try to report this to
the hdf-forum list and get back to you.

Cheers,

-- 
Francesc Alted

"One would expect people to feel threatened by the 'giant brains or
machines that think'. In fact, the frightening computer becomes less
frightening if it is used only to simulate a familiar noncomputer."
-- Edsger W. Dijkstra

_______________________________________________
Pytables-users mailing list
Pytables-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/pytables-users
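A note on the h5ls figures in the reply: the small sketch below checks the arithmetic behind the "Storage" line. It rests on an assumption that is not stated anywhere in the thread, namely that the chunked dataset stores one 16-byte variable-length descriptor per row rather than the integer data itself. The constant names are illustrative only.

```python
ROWS = 50               # rows appended to /see in the original session
DESCRIPTOR_BYTES = 16   # ASSUMPTION: size of one variable-length descriptor per row
ALLOCATED_BYTES = 391   # "allocated bytes" as reported by h5ls -v

# Logical size of the dataset as the filter pipeline would see it
# under the assumption above.
logical_bytes = ROWS * DESCRIPTOR_BYTES

# h5ls reports utilization as logical size over allocated size.
utilization = 100 * logical_bytes / ALLOCATED_BYTES

# Actual integer payload written by the session (50 rows of 1,000,000 int32).
payload_bytes = ROWS * 1_000_000 * 4

print(f"logical bytes: {logical_bytes}")
print(f"utilization:   {utilization:.2f}%")
print(f"payload bytes: {payload_bytes}")
```

Under that assumption the result reproduces both the 800 logical bytes and the 204.60% utilization from the report. This would suggest the shuffle and deflate filters only ever saw the per-row descriptors, not the roughly 200 MB of integer data, though that reading still needs confirmation from the HDF5 side.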