Hi Jan,
On Mar 9, 2010, at 11:42 AM, Jan Linxweiler wrote:
> Hi Quincey,
>
> thank you for your answer. I tried not setting a fill value, but I think the
> dataset is not valid then. I could not figure out how to identify invalid
> chunks. Also, HDFView was not able to read those files. Therefore I suppose
> this is not the common way to use chunked datasets, is it?
When you didn't set the fill value, was the file smaller (because the
chunks weren't allocated)? If that's true, then I think this is a bug in the
HDF5 library, and we should fix it so that it doesn't allocate chunks when a
fill value is defined and the allocation time is incremental or late.
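
For a sense of scale, here is a rough sketch of the arithmetic for the dataset
in your original message. It assumes that `datatype` is an 8-byte double
(suggested by the `double` fill value, but not stated) and it ignores the chunk
index, other metadata, and compression. The names `chunks_along`, `Allocation`
and `full_allocation` are made up for this illustration; they are not HDF5
calls.

```cpp
#include <cstdint>

// Chunks needed to cover `extent` elements along one dimension. A partially
// filled edge chunk still occupies a whole chunk, hence the round-up.
std::uint64_t chunks_along(std::uint64_t extent, std::uint64_t chunk) {
    return (extent + chunk - 1) / chunk;
}

// Hypothetical helper type, for illustration only.
struct Allocation {
    std::uint64_t chunk_count;      // number of chunks covering the dataset
    std::uint64_t bytes_per_chunk;  // raw (uncompressed) size of one chunk
    std::uint64_t total_bytes;      // raw size if every chunk is allocated
};

// Raw space needed if every chunk of a rank-3 dataset is allocated.
// elem_size is the size of one element in bytes (assumed 8 for a double).
Allocation full_allocation(const std::uint64_t (&dims)[3],
                           const std::uint64_t (&chunk)[3],
                           std::uint64_t elem_size) {
    std::uint64_t count = 1;
    std::uint64_t bytes = elem_size;
    for (int i = 0; i < 3; ++i) {
        count *= chunks_along(dims[i], chunk[i]);
        bytes *= chunk[i];
    }
    return Allocation{count, bytes, count * bytes};
}
```

With the dimensions from your message ({10000,1000,1000}, 10x10x10 chunks,
8-byte elements) this gives 10,000,000 chunks of 8,000 bytes each, i.e. about
80 GB of raw data if every chunk is allocated. If only the written chunks were
allocated, a sparse dataset should come out far smaller than that.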
Quincey
> Jan
>
> On 09.03.2010, at 18:20, Quincey Koziol wrote:
>
>> Hi Jan,
>>
>> On Mar 9, 2010, at 5:22 AM, Jan Linxweiler wrote:
>>
>>> It seems like simply enabling compression does not change anything. The
>>> files for sparse and dense matrices still have the same size.
>>>
>>> Can anyone give me a hint on how to work this out?
>>
>> Hmm, I would think that you are correct in your expectations. Can you
>> try without setting the fill value and see what happens?
>>
>> Quincey
>>
>>> On 09.03.2010, at 12:01, Jan Linxweiler wrote:
>>>
>>>> Hallo all,
>>>>
>>>> I have a question concerning the allocated file space of chunked datasets.
>>>> I use a chunked dataset with a fill value and incremental allocation time
>>>> as follows:
>>>>
>>>> hsize_t chunk_dims[3] = {10, 10, 10};
>>>> const int rank = 3;
>>>>
>>>> H5::DSetCreatPropList cparms;
>>>> cparms.setChunk( rank, chunk_dims );
>>>>
>>>> /* Set fill value for the dataset. */
>>>> double fill_val = -999.999;
>>>> cparms.setFillValue( datatype, &fill_val );
>>>>
>>>> /* Set allocation time. */
>>>> cparms.setAllocTime(H5D_ALLOC_TIME_INCR);
>>>>
>>>> /*
>>>> * create dataspace with min/max dimensions.
>>>> */
>>>> hsize_t min_dims[] = {10000,1000,1000};
>>>>
>>>> hsize_t max_dims[] = {
>>>> H5S_UNLIMITED,
>>>> H5S_UNLIMITED,
>>>> H5S_UNLIMITED
>>>> };
>>>>
>>>> H5::DataSpace dataspace( rank, min_dims, max_dims );
>>>>
>>>> ....
>>>>
>>>> As I understand it, memory is only allocated for chunks that data is
>>>> actually written to. In other words, nothing is allocated for chunks that
>>>> contain only fill values. My question is: is this also true for the file
>>>> space on the disk? My observation is that space for the whole dataset
>>>> (including "empty" chunks) is allocated on the disk. I compared sparse
>>>> matrices with full matrices and the allocated space is nearly identical.
>>>> Is there a way to reduce the size of sparse matrices on the disk? I am
>>>> thinking of using compression. Is this a common way to achieve this, or do
>>>> you recommend something different?
>>>>
>>>> Thank you in advance,
>>>>
>>>> Jannis
>>>>
>>>>
>>>>
>>>> _______________________________________________
>>>> Hdf-forum is for HDF software users discussion.
>>>> [email protected]
>>>> http://mail.hdfgroup.org/mailman/listinfo/hdf-forum_hdfgroup.org