>> > Please, try again with the above directives.
>>
>> Ah, OK, stupid me.
>>
>> OK, as I indicated in a previous post, that is actually what I asked
>> about there, and I had actually already tried to copy lzo1.dll to
>> C:\Python25\Lib\site-packages\tables
>>
>> but there the test gave the same warning that it could not find LZO.
>>
>> However, upon moving it to C:\Windows\System32 (man, I really dislike
>> fiddling with DLLs in that dir) it was able to find it on doing an
>> import tables; tables.test()
>>
>> I don't understand why it could not find it in the \tables folder.
>
> Neither do I. Perhaps somebody who knows Windows and its intricacies
> better can shed more light here.
>
> Well, at least you finally have LZO support in PyTables.

Yes, the important point is that it is possible to make it work now.
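By the way, a quick way to check whether PyTables actually picks up LZO, without running the whole test suite, seems to be querying the compression libraries directly (if I understand the docs correctly, whichLibVersion() returns None when a library is not found):

import tables
print tables.whichLibVersion('lzo')   # None means LZO was not detected
print tables.whichLibVersion('zlib')  # for comparison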
In my application the data are written once and read many times, and the data sizes are large, so LZO caught my attention when reading the Optimization Tips chapter: its decompression should be very fast, and the extra processing could be more than compensated for by avoiding the bottleneck of file I/O from the HDDs. So I wrote myself a little test program with some test data whose entropy (compressibility) is roughly similar to my real data:

import os
from stat import ST_SIZE
import time

import numpy as np
import tables as tb

total_size = 10 ** 10        # total bytes written per run
chunk_size = 5 * 10 ** 5     # bytes appended (and read back) per call
complib = 'lzo'
max_comp_lvl = 9
h5name = 'd:/test.h5'

dtype = np.dtype([('x', '<f4'), ('y', '<f4'), ('z', '<f4')])
recs = total_size / dtype.itemsize
recs_per_chunk = chunk_size / dtype.itemsize

# One reusable chunk of offset Gaussian noise, roughly as compressible
# as the real data.
test_data_chunk = np.empty(recs_per_chunk, dtype=dtype).view(np.recarray)
test_data_chunk.x[:] = 1.0 + np.random.standard_normal(recs_per_chunk)
test_data_chunk.y[:] = 1000.0 + np.random.standard_normal(recs_per_chunk)
test_data_chunk.z[:] = 1000000000.0 + np.random.standard_normal(recs_per_chunk)

print "Testing HDF5 write, read performance of offset Gaussian noise data for %d bytes in chunks of %d bytes using %s compression:" % \
    (total_size, chunk_size, complib)

for complevel in xrange(max_comp_lvl + 1):
    # -- write test --
    start_time = time.time()
    filters = tb.Filters(complib=complib, complevel=complevel)
    h5 = tb.openFile(h5name, mode='w', filters=filters)
    test_tbl = h5.createTable(h5.root, "test", np.empty(0, dtype=dtype),
                              expectedrows=recs, chunkshape=recs_per_chunk)
    bytes_written = 0
    while bytes_written < total_size:
        test_tbl.append(test_data_chunk)
        bytes_written += chunk_size
    h5.close()
    elapsed = time.time() - start_time
    data_rate = 1.0 * total_size / elapsed
    print "Write test with compression level %d: %6.1f MB/s" % (complevel, data_rate * 1.0e-6)
    h5compression = 1.0 * total_size / os.stat(h5name)[ST_SIZE]
    print "HDF5 file compressed by 1:%5.3f" % h5compression

    # -- read test --
    start_time = time.time()
    h5 = tb.openFile(h5name, mode='r')
    test_tbl = h5.root.test
    for start in xrange(0, recs, recs_per_chunk):
        test_tbl.read(start, start + recs_per_chunk)
    h5.close()
    elapsed = time.time() - start_time
    data_rate = 1.0 * total_size / elapsed
    print "Read test with compression level %d: %6.1f MB/s" % (complevel, data_rate * 1.0e-6)

    os.remove(h5name)
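As a side note (not part of the timings), a quick sanity check that the requested filters and chunkshape really took effect on the table could be something like this, placed right after the createTable() call:

# inside the for loop, right after createTable() -- a sanity check, not timed
print test_tbl.filters     # shows the complib and complevel actually in use
print test_tbl.chunkshape  # should match recs_per_chunk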
With these results:

Testing HDF5 write, read performance of offset Gaussian noise data for 10000000000 bytes in chunks of 500000 bytes using lzo compression:
Write test with compression level 0:   70.0 MB/s
HDF5 file compressed by 1:1.000
Read test with compression level 0:  121.3 MB/s
Write test with compression level 1:   64.3 MB/s
HDF5 file compressed by 1:1.983
Read test with compression level 1:  138.3 MB/s
Write test with compression level 2:   65.6 MB/s
HDF5 file compressed by 1:1.983
Read test with compression level 2:  139.2 MB/s
Write test with compression level 3:   65.8 MB/s
HDF5 file compressed by 1:1.983
Read test with compression level 3:  141.3 MB/s
Write test with compression level 4:   65.9 MB/s
HDF5 file compressed by 1:1.983
Read test with compression level 4:  155.5 MB/s
Write test with compression level 5:   65.7 MB/s
HDF5 file compressed by 1:1.983
Read test with compression level 5:  140.4 MB/s
Write test with compression level 6:   65.9 MB/s
HDF5 file compressed by 1:1.983
Read test with compression level 6:  142.7 MB/s
Write test with compression level 7:   64.7 MB/s
HDF5 file compressed by 1:1.983
Read test with compression level 7:  138.2 MB/s
Write test with compression level 8:   65.0 MB/s
HDF5 file compressed by 1:1.983
Read test with compression level 8:  136.4 MB/s
Write test with compression level 9:   64.6 MB/s
HDF5 file compressed by 1:1.983
Read test with compression level 9:  135.8 MB/s

I see that at a compression level of about 4, the write speed only goes down from 70 MB/s to 66 MB/s, while the read speed increases from 121 MB/s to 155 MB/s (about 28%). Actually I had hoped for a larger relative increase in read speed, based on what I saw in the Optimization Tips chapter. Are there tricks for making it even faster, or have I done something stupid in my test code? Not a big issue, though, as 155 MB/s is really good enough for my application. Curiously, the compression ratio is independent of the compression level.

Cheers,
Kim
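P.S. One thing I still want to try (just a guess on my part that per-call overhead matters at this block size) is reading back in larger blocks than one 500 kB chunk per read() call, roughly like this:

# hypothetical variation of the read loop: fetch 100 HDF5 chunks per call
block = 100 * recs_per_chunk
h5 = tb.openFile(h5name, mode='r')
test_tbl = h5.root.test
for start in xrange(0, recs, block):
    data = test_tbl.read(start, min(start + block, recs))
h5.close()

If that helps, it would suggest the limit is per-call overhead rather than LZO decompression itself.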