So each time series is a dataset in the file's root group, and you thus end up with a million datasets in that one group?

Does performance also degrade when you use the HDF5 command-line tools, such as h5ls or h5dump, on such a file?

It might make sense to rearrange your datasets hierarchically so that you have only, say, 1000 datasets per group: create 1000 groups, each covering a range of time series, and you still get your million datasets, but with only 1000 links per group.
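A rough sketch of such a bucketing scheme (the "/g%04d" naming and the divisor of 1000 are just illustrative assumptions, not anything from your code):

    #include <stdio.h>     /* snprintf */
    #include "hdf5.h"

    /* Return the sub-group that holds series number series_index,
     * creating it on first use; 1000 series per group. */
    hid_t open_bucket_group(hid_t file_id, int series_index)
    {
        char  group_name[16];
        hid_t group_id;

        snprintf(group_name, sizeof(group_name), "/g%04d", series_index / 1000);

        /* Try to open an existing group quietly; create it if that fails. */
        H5E_BEGIN_TRY {
            group_id = H5Gopen2(file_id, group_name, H5P_DEFAULT);
        } H5E_END_TRY;
        if (group_id < 0)
            group_id = H5Gcreate2(file_id, group_name, H5P_DEFAULT,
                                  H5P_DEFAULT, H5P_DEFAULT);
        return group_id;
    }

You would then pass the returned group id to H5Dcreate2() in place of file_id, and call H5Gclose() on it once the dataset is created.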

If all time series are of the same length, another option is to put them all into one dataset, leaving one dimension unlimited and extending it as new time series come in; each time series then becomes a single chunk (one row) of that 2-dimensional dataset.
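If you go that route, a minimal sketch could look like the following (the dataset name "/all_series" and the one-row-per-series layout are my assumptions):

    #include "hdf5.h"

    /* Create a 0 x nVals dataset whose first dimension can grow without
     * limit; each chunk holds exactly one series. */
    hid_t create_series_store(hid_t file_id, hsize_t nVals)
    {
        hsize_t dims[2]    = { 0, nVals };
        hsize_t maxdims[2] = { H5S_UNLIMITED, nVals };
        hsize_t chunk[2]   = { 1, nVals };
        hid_t   space, dcpl, dset;

        space = H5Screate_simple(2, dims, maxdims);
        dcpl  = H5Pcreate(H5P_DATASET_CREATE);
        H5Pset_chunk(dcpl, 2, chunk);
        dset  = H5Dcreate2(file_id, "/all_series", H5T_NATIVE_DOUBLE,
                           space, H5P_DEFAULT, dcpl, H5P_DEFAULT);
        H5Pclose(dcpl);
        H5Sclose(space);
        return dset;
    }

    /* Append one series of nVals doubles as a new row. */
    herr_t append_series(hid_t dset_id, const double *vals, hsize_t nVals)
    {
        hsize_t dims[2], start[2], count[2];
        hid_t   filespace, memspace;
        herr_t  status;

        /* Find out how many rows are stored already. */
        filespace = H5Dget_space(dset_id);
        H5Sget_simple_extent_dims(filespace, dims, NULL);
        H5Sclose(filespace);

        /* Grow the dataset by one row ... */
        start[0] = dims[0];  start[1] = 0;
        count[0] = 1;        count[1] = nVals;
        dims[0] += 1;
        status = H5Dset_extent(dset_id, dims);
        if (status < 0) return status;

        /* ... and write the new row through a hyperslab selection. */
        filespace = H5Dget_space(dset_id);
        H5Sselect_hyperslab(filespace, H5S_SELECT_SET, start, NULL, count, NULL);
        memspace = H5Screate_simple(2, count, NULL);
        status = H5Dwrite(dset_id, H5T_NATIVE_DOUBLE, memspace, filespace,
                          H5P_DEFAULT, vals);
        H5Sclose(memspace);
        H5Sclose(filespace);
        return status;
    }

This avoids creating a new object in the file per series entirely; adding a series then costs just one chunk write plus a small extent update.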

Cheers,
           Werner



On 11.06.2014 00:40, Tony Kennedy - Ventana Systems UK wrote:
I've got to store anything from ten to possibly millions of time series. At the moment, I create a simple dataspace and store each time series as its own dataset, which all works. But once I get to a few thousand time series, performance drops off to the point where HDF5 is no longer an option for me.

Can anyone suggest anything I can try? The function that writes the data to the HDF5 file is below; it's pretty simple.

Any suggestions at all are more than welcome.

All the best,

Tony.


int AddTimeSeriesToHDF5File(hid_t file_id, const char *pStrVarName, double *pVars, int nVals, BOOL bCompress)
{
    hsize_t dims[1];
    hid_t   dataspace_id;
    hid_t   dataset_id;
    hid_t   aid2;
    hid_t   attr2;
    herr_t  status;
    hid_t   plist_id;       /* dataset creation property list (compression) */
    hsize_t cdims[1];

    dims[0] = nVals;
    dataspace_id = H5Screate_simple(1, dims, NULL);

    if (bCompress)
    {
        /* Compression requires a chunked layout; use one chunk per series. */
        plist_id = H5Pcreate(H5P_DATASET_CREATE);
        cdims[0] = nVals;
        status = H5Pset_chunk(plist_id, 1, cdims);
        status = H5Pset_deflate(plist_id, 9);
    }
    else
    {
        plist_id = H5P_DEFAULT;
    }

    /* Create the dataset. */
    dataset_id = H5Dcreate2(file_id, pStrVarName, H5T_NATIVE_DOUBLE,
                            dataspace_id, H5P_DEFAULT, plist_id, H5P_DEFAULT);

    /* Attach a scalar attribute recording the number of points. */
    aid2  = H5Screate(H5S_SCALAR);
    attr2 = H5Acreate2(dataset_id, "Number of points", H5T_NATIVE_INT,
                       aid2, H5P_DEFAULT, H5P_DEFAULT);
    status = H5Awrite(attr2, H5T_NATIVE_INT, &nVals);
    status = H5Aclose(attr2);
    status = H5Sclose(aid2);

    /* Write the time series data. */
    status = H5Dwrite(dataset_id, H5T_NATIVE_DOUBLE, H5S_ALL, H5S_ALL,
                      H5P_DEFAULT, pVars);

    status = H5Sclose(dataspace_id);
    status = H5Dclose(dataset_id);
    if (plist_id != H5P_DEFAULT)
        status = H5Pclose(plist_id);

    return status;
}



--
___________________________________________________________________________
Dr. Werner Benger                Visualization Research
Center for Computation & Technology at Louisiana State University (CCT/LSU)
2019  Digital Media Center, Baton Rouge, Louisiana 70803
Tel.: +1 225 578 4809                        Fax.: +1 225 578 5362

