Hello all,
I have a question concerning the allocated file space of chunked
datasets. I use a chunked dataset with a fill value and incremental
allocation time as follows:
const int rank = 3;
hsize_t chunk_dims[3] = { 10, 10, 10 };

/* Element type of the dataset (double). */
H5::DataType datatype( H5::PredType::NATIVE_DOUBLE );

H5::DSetCreatPropList cparms;
cparms.setChunk( rank, chunk_dims );

/* Set fill value for the dataset. */
double fill_val = -999.999;
cparms.setFillValue( datatype, &fill_val );

/* Set allocation time. */
cparms.setAllocTime( H5D_ALLOC_TIME_INCR );

/*
 * Create dataspace with initial and maximum dimensions.
 */
hsize_t min_dims[] = { 10000, 1000, 1000 };
hsize_t max_dims[] = { H5S_UNLIMITED, H5S_UNLIMITED, H5S_UNLIMITED };
H5::DataSpace dataspace( rank, min_dims, max_dims );
....
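The dataset is then created from these settings, and I check how much space it
actually occupies in the file with getStorageSize(). Continuing directly from the
snippet above (file name "sparse.h5" and dataset name "data" are just placeholders),
it looks roughly like this:

H5::H5File file( "sparse.h5", H5F_ACC_TRUNC );        /* placeholder file name */
H5::DataSet dataset = file.createDataSet( "data",     /* placeholder dataset name */
                                          datatype, dataspace, cparms );

/* ... write a few chunks, leave the rest of the dataset untouched ... */

/* Bytes actually allocated for the raw data of this dataset in the file. */
hsize_t on_disk = dataset.getStorageSize();
std::cout << "allocated storage: " << on_disk << " bytes" << std::endl;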
As I understand it, memory is only allocated for chunks that data is
actually written to; in other words, no space is allocated for chunks
that contain only fill values. My question is: does this also hold for
the file space on disk? My observation is that space for the whole
dataset (including "empty" chunks) is allocated on disk. I compared
sparse matrices with full matrices and the allocated space is nearly
identical. Is there a way to reduce the size of sparse matrices on
disk? I am thinking of using compression. Is this a common way to
achieve this, or do you recommend something different?
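
In case compression is the way to go, this is roughly what I have in mind:
adding a deflate filter to the same creation property list (the level 6 is just
a guess on my part; rank, chunk_dims, datatype and fill_val are the variables
from my snippet above):

H5::DSetCreatPropList cparms;
cparms.setChunk( rank, chunk_dims );
cparms.setFillValue( datatype, &fill_val );
cparms.setAllocTime( H5D_ALLOC_TIME_INCR );
/* gzip/deflate filter; compression is applied per chunk as it is written */
cparms.setDeflate( 6 );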
Thank you in advance,
Jannis