Hello all,

I have a question concerning the allocated file space of chunked datasets. I use a chunked dataset with a fill value and incremental allocation time as follows:

   hsize_t chunk_dims[3] = {10, 10, 10};
   const int rank = 3;

   H5::DSetCreatPropList cparms;
   cparms.setChunk( rank, chunk_dims );

   /* Set fill value for the dataset. */
   double fill_val = -999.999;
   cparms.setFillValue( H5::PredType::NATIVE_DOUBLE, &fill_val );

   /* Set allocation time. */
   cparms.setAllocTime(H5D_ALLOC_TIME_INCR);

   /*
   * create dataspace with min/max dimensions.
   */
   hsize_t min_dims[] = {10000,1000,1000};

   hsize_t max_dims[] = {
      H5S_UNLIMITED,
      H5S_UNLIMITED,
      H5S_UNLIMITED
   };

   H5::DataSpace dataspace( rank, min_dims, max_dims );

   ....

As I understand it, space is only allocated for chunks that data is actually written to; in other words, no space is allocated for chunks that contain only fill values. My question is: does this also hold for the file space on disk? My observation is that space for the whole dataset (including "empty" chunks) is allocated in the file. I compared sparse matrices with full matrices and the allocated file space is nearly identical. Is there a way to reduce the size of sparse matrices on disk? I am thinking of using compression. Is that a common way to achieve this, or would you recommend something different?
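
For illustration, this is roughly what I have in mind, continuing from the property list above; the deflate level 6, the file name, and the dataset name are just placeholders I picked, and I have not verified that this is the right approach:

   /* Sketch: enable gzip (deflate) compression on the same chunked
      dataset creation property list. Level 6 is only a placeholder. */
   cparms.setDeflate( 6 );

   /* Create the dataset using the chunked, compressed property list. */
   H5::H5File file( "sparse.h5", H5F_ACC_TRUNC );
   H5::DataSet dataset = file.createDataSet(
      "sparse_matrix", H5::PredType::NATIVE_DOUBLE, dataspace, cparms );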

Thank you in advance,

Jannis


