Francesc Alted said the following on 11/9/2010 12:42 PM:
> After having a look at your script, yes, I think this is the expected 
> behaviour.  In order to explain this you need to know how HDF5 stores 
> its data internally.  For chunked datasets (the Table object is an 
> example of this), the I/O is done in terms of complete chunks.  Each 
> chunk is then passed to the filters (if any) for compression (or other 
> operations).
> 
> In this case, when you are creating the table and using compression, the 
> chunks are compressed very well, and take very little space on disk.  
> But, when you are *updating* the existing data, you are introducing more 
> entropy and compression does not work as efficiently.  As a consequence, 
> the resulting chunks are larger on disk than the original ones, and 
> hence they need to be saved in another place (normally at the end of the 
> file).  HDF5 cannot presently remove (or reuse) the old chunks in an 
> easy way, and has to allocate new space for the resulting chunks.  The 
> only way to reclaim the space taken by the 'old' chunks is to 'repack' 
> the HDF5 file (as you have already noticed).

While not *precisely* the answer I wanted to hear ;-), this makes sense in
retrospect.  Therefore I shall adjust my code accordingly.  Thank you very much
for your time and rapid response.
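
For anyone following the thread, the effect described above can be reproduced
with a toy model. The sketch below is NOT the real HDF5 allocator: the class
ToyChunkedFile, the constant CHUNK_BYTES and the "overwrite in place if it
fits, otherwise append" rule are assumptions made purely for illustration,
and zlib from the Python standard library stands in for whatever filter the
table uses. It only shows why well-compressed chunks that are later
overwritten with higher-entropy data make the file grow, and how much a
repack would recover.

```python
import os
import zlib

CHUNK_BYTES = 4096  # hypothetical uncompressed chunk size, illustration only


class ToyChunkedFile:
    """Toy model of a chunk store that never frees old chunks.

    This is an assumption-laden sketch, not HDF5's actual allocation logic.
    """

    def __init__(self):
        self.size = 0     # total bytes ever allocated in the "file"
        self.slots = {}   # chunk index -> (offset, allocated length)

    def write_chunk(self, index, raw):
        packed = zlib.compress(raw)
        slot = self.slots.get(index)
        if slot is not None and len(packed) <= slot[1]:
            # The compressed chunk still fits in its old slot: reuse it.
            return
        # It does not fit (or is new): allocate space at the end of the
        # file. Any previous slot for this chunk becomes dead space.
        self.slots[index] = (self.size, len(packed))
        self.size += len(packed)

    def live_bytes(self):
        # Bytes still referenced by a chunk, i.e. what a repack would keep.
        return sum(length for _, length in self.slots.values())


store = ToyChunkedFile()
n_chunks = 8

# Creation: low-entropy data (all zeros) compresses very well.
for i in range(n_chunks):
    store.write_chunk(i, bytes(CHUNK_BYTES))
size_after_create = store.size

# Update: high-entropy data no longer fits in the old, tiny slots.
for i in range(n_chunks):
    store.write_chunk(i, os.urandom(CHUNK_BYTES))
size_after_update = store.size

repacked_size = store.live_bytes()

print("after create:", size_after_create)
print("after update:", size_after_update)
print("after repack:", repacked_size)
```

In this model the dead space is exactly the space taken by the original
chunks, which is the part that repacking (for example with the ptrepack
utility shipped with PyTables) gets back.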

--David

-- 
David E. Sallis, Senior Principal Engineer, Software
General Dynamics Information Technology
NOAA Coastal Data Development Center
Stennis Space Center, Mississippi
228.688.3805
david.sal...@gdit.com
david.sal...@noaa.gov
--------------------------------------------
"Better Living Through Software Engineering"
--------------------------------------------

_______________________________________________
Pytables-users mailing list
Pytables-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/pytables-users
