Hi Andreas,

On 25 May 2013, at 17:06, Andreas Hilboll <li...@hilboll.de> wrote:

> On 25.05.2013 at 14:27, Andreas Hilboll wrote:
>> Hi,
>> the netcdf4-python project
>> (http://netcdf4-python.googlecode.com/svn/trunk/docs/netCDF4.Dataset-class.html#createVariable)
>> supports a "least_significant_digit" attribute when creating a
>> variable/array. This leads to a truncation of the array data before
>> storing it to disk
>> (https://code.google.com/p/netcdf4-python/source/browse/trunk/netCDF4_utils.py#26),
>> which makes zlib compression more effective.
>> My question: Is the same true when I compress the array data with blosc?
>> Will I get significant compression improvements when truncating my data
>> before storing it in pytables?
> Actually, I can now answer my own question: Yes, it does save some
> space. As a test, I created a file with two 5760x2880x12 arrays of dtype
> float32. The data values are all in the range between +-1E17. When I
> truncate the input values to 1E11 (least_significant_digit=-11), I
> get about 20% space reduction:
> -rw-r--r-- 1 andreas andreas 418M Mai 25 16:47 satdb_blosc9-11.h5
> -rw-r--r-- 1 andreas andreas 578M Mai 25 16:34 satdb_blosc9.h5
> Would you guys be interested in having this as an optional filter? If
> so, I'd be happy to submit a PR for this.
> -- Andreas.

Thanks, Andreas. It would be a nice addition to PyTables.

In PyTables 3.0 (rc2 is currently out) we introduced support for the 
float16 data type.
It is not as flexible as the solution you are suggesting, but IMO it could help 
in your case.
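
For reference, the truncation discussed in the quoted message can be sketched roughly as follows. This is only an illustration modelled on the idea behind netcdf4-python's least_significant_digit, not code from PyTables or a copy of the netCDF4 implementation: the function name `quantize`, the random test data, and the use of zlib from the standard library (instead of blosc) are all assumptions made to keep the example self-contained.

```python
import zlib

import numpy as np


def quantize(data, least_significant_digit):
    """Round data to a power-of-two step no coarser than
    10**(-least_significant_digit). Hypothetical helper, for illustration."""
    # Number of binary digits needed to resolve the requested decimal digit.
    bits = np.ceil(np.log2(10.0 ** least_significant_digit))
    scale = 2.0 ** bits
    # Do the rounding in float64, then go back to the original dtype.
    rounded = np.around(scale * data.astype(np.float64)) / scale
    return rounded.astype(data.dtype)


# Synthetic float32 data in the range +-1E17 (an assumption; the real
# satellite data will have a different distribution).
rng = np.random.default_rng(0)
data = rng.uniform(-1e17, 1e17, size=200_000).astype(np.float32)

# Keep values only to roughly 1E11, as with least_significant_digit=-11.
truncated = quantize(data, -11)

# Compare compressed sizes; zlib stands in for blosc here.
size_raw = len(zlib.compress(data.tobytes(), 9))
size_truncated = len(zlib.compress(truncated.tobytes(), 9))
print("raw:", size_raw, "bytes, truncated:", size_truncated, "bytes")

# Largest absolute change introduced by the truncation.
max_error = np.max(np.abs(truncated.astype(np.float64) - data.astype(np.float64)))
print("max error:", max_error)
```

The rounding zeroes the low mantissa bits, which is what gives the compressor more repetition to work with. In PyTables the truncated array would then be written with the usual filters, for example tables.Filters(complevel=9, complib='blosc'); how much is saved will depend on the actual data.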

best regards

Antonio Valentino

Pytables-users mailing list
