Hi!

The concrete questions are posted below. But first, some details.


I am building a simple 2D table database (where the column data types can
differ) with the high-level HDF5 Table API (H5TB). The table only has to
support simple operations such as creating, reading, updating and deleting
rows and columns (mainly operations on a single row or column).

Since the column data types are not fixed at compile time, I am manually
constructing the appropriate struct-like object (computing the offsets,
sizes etc. of the fields by hand) and passing that to the H5TB API.
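
To make the offset bookkeeping concrete, here is a minimal sketch in Python
using only the standard `struct` module. It is not my actual code: the
function name `build_layout`, the column names and the use of `struct`
format codes as stand-ins for native types are all assumptions made for
illustration. It assumes a packed record with no padding between fields.

```python
import struct


def build_layout(columns):
    """Return (fields, record_size) for a packed record with no padding.

    `columns` is a list of (name, struct format code) pairs chosen at run
    time. `fields` is a list of (name, offset, size) tuples in column order.
    """
    fields = []
    offset = 0
    for name, fmt in columns:
        # "=" selects standard sizes and disables alignment padding.
        size = struct.calcsize("=" + fmt)
        fields.append((name, offset, size))
        offset += size
    return fields, offset


# Hypothetical schema: a 64-bit integer, a double, a byte and a 16-byte string.
columns = [("id", "q"), ("temperature", "d"), ("flag", "b"), ("label", "16s")]
fields, record_size = build_layout(columns)

for name, offset, size in fields:
    print(f"{name:12s} offset={offset:3d} size={size:3d}")
print("record size:", record_size)
```

The per-field offsets, per-field sizes and the total record size computed
this way are the kind of values that then get handed to the H5TB calls.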

After testing my implementation I found that some of the single row/column
operations are very slow. I think this is mainly due to two reasons:
1. The H5TB API has to open and close the dataset on every operation.
2. It has to construct the record (row) memory data type on every operation.

I think the second point severely affects the performance as the number of
columns grows.
For example, for reading single rows:
With 100 columns (compound type with 100 members) I get a speed of ~700
rows/sec.
With 1000 columns, the speed is ~15 rows/sec.
With 10000 columns, I am unable to create the table and get error messages
like: "H5D__update_oh_info(): unable to update datatype header message" and
"H5O_alloc(): object header message is too large".
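
As a rough check on how the read time scales, the two rates above can be
turned into time per row and time per column. The short Python sketch below
does only that arithmetic on the numbers I measured; the power-law exponent
is an estimate from just two data points, so it should be read as an
indication rather than a fitted result.

```python
import math

# Measured single-row read rates from above: number of columns -> rows/sec.
measurements = {100: 700.0, 1000: 15.0}

for ncols, rows_per_sec in measurements.items():
    ms_per_row = 1000.0 / rows_per_sec          # milliseconds per row read
    us_per_col = 1e6 / (rows_per_sec * ncols)   # microseconds per column
    print(f"{ncols:5d} columns: {ms_per_row:7.2f} ms/row, "
          f"{us_per_col:6.2f} us/column")

# How much slower a row read gets when the column count grows 10x.
slowdown = measurements[100] / measurements[1000]
col_ratio = 1000 / 100

# If time per row ~ ncols**k, then k = log(slowdown) / log(col_ratio).
exponent = math.log(slowdown) / math.log(col_ratio)
print(f"slowdown: {slowdown:.1f}x for {col_ratio:.0f}x columns, "
      f"exponent ~ {exponent:.2f}")
```

Ten times more columns makes a single-row read about 47 times slower, so the
cost per row grows faster than linearly in the number of columns.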


So, finally, here are my questions:
1. Is my design fundamentally flawed (i.e. was the H5TB API never intended
for this purpose), or am I just doing something wrong?
2. Would I get rid of the performance problems by not closing the dataset
and not constructing the record data type on every operation (e.g. by
writing an optimized version of the H5TB API, something like what the
PyTables library does)?
3. Is there an alternative to constructing a compound data type, so that I
can create tables with a million columns?


Regards,
Reimo Rebane



--
View this message in context: 
http://hdf-forum.184993.n3.nabble.com/Using-H5TB-for-large-tables-tp4025782.html
Sent from the hdf-forum mailing list archive at Nabble.com.
