I have a table in an HDF5 file consisting of 9 columns and just over 6000 rows,
and an application that updates these rows.  The application runs hourly and
updates the table during each run; no new rows are added, only existing rows
are modified.  I perform the updates by calling row.update() inside a
table.where() iterator loop.
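
For concreteness, the update loop looks roughly like this (a simplified
sketch; the file name, table path, condition, and column names are
placeholders, not the real ones):

    import tables

    # Open the existing file for in-place modification; nothing is appended.
    fileh = tables.openFile('data.h5', mode='r+')   # placeholder file name
    table = fileh.getNode('/mygroup/mytable')       # placeholder table path

    # Update the matching rows in place via the row iterator.
    for row in table.where('station_id == 42'):     # placeholder condition
        row['value'] = 3.14                         # placeholder column/value
        row.update()

    table.flush()
    fileh.close()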

I have noticed that the file grows significantly after each application run;
over time its size balloons from just over 21 MB to well over 750 MB, even
though no new data is being added, only updated.

Running h5repack on this file restores it to its original size with no loss
of data.
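
(Concretely, the repack step is just the standard command-line tool, something
like the following, with placeholder file names, after which I replace the
original file with the packed copy:

    h5repack data.h5 data_packed.h5
)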

My questions are:

1) What causes the file size increase?
2) Is there anything I can do to prevent it?

I am using PyTables 2.1.1, HDF5 1.8.3, and Python 2.6 under Red Hat Linux 5.

-- 
David E. Sallis, Senior Principal Engineer, Software
General Dynamics Information Technology
NOAA Coastal Data Development Center
Stennis Space Center, Mississippi
228.688.3805
david.sal...@gdit.com
david.sal...@noaa.gov
--------------------------------------------
"Better Living Through Software Engineering"
--------------------------------------------
