Hi Mohamad,

num_waveforms is around 2000 (defined on, say, 384 processors, that means about 785,000 waveform definitions across all processors). We would like to get this code to scale to a num_waveforms of 300,000.

I will give your suggestions a shot.
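For reference, here is a rough, untested sketch of how I understand the libver-bounds suggestion would fit into our file creation, alongside the MPI-IO driver we already use for phdf5 (the filename and the bare error handling are just placeholders):

    #include <mpi.h>
    #include <hdf5.h>

    int main(int argc, char **argv) {
      MPI_Init(&argc, &argv);

      /* File access property list: MPI-IO driver plus the latest file format,
         so groups holding many datasets use the new-style indexing. */
      hid_t fapl_id = H5Pcreate(H5P_FILE_ACCESS);
      H5Pset_fapl_mpio(fapl_id, MPI_COMM_WORLD, MPI_INFO_NULL);
      H5Pset_libver_bounds(fapl_id, H5F_LIBVER_LATEST, H5F_LIBVER_LATEST);

      /* "waveforms.h5" is a placeholder name for the ASDF output file. */
      hid_t file_id = H5Fcreate("waveforms.h5", H5F_ACC_TRUNC, H5P_DEFAULT, fapl_id);

      /* ... call ASDF_define_waveforms(...) against this file as before,
         then do the independent writes ... */

      H5Fclose(file_id);
      H5Pclose(fapl_id);
      MPI_Finalize();
      return 0;
    }

Does that look like the intended usage?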
Thanks!
James

> On Dec 7, 2015, at 3:26 PM, Mohamad Chaarawi <chaar...@hdfgroup.org> wrote:
>
> Hi James,
>
> How many datasets are you creating in total (i.e. what is num_waveforms)?
> See this message from Elena last week on how to resolve a performance
> problem when creating a large number of objects:
>
> "
> Try to use the H5Pset_libver_bounds function (see
> https://www.hdfgroup.org/HDF5/doc/RM/RM_H5P.html#Property-SetLibverBounds),
> using H5F_LIBVER_LATEST for the second and third arguments, to set up a file
> access property list, and then use that access property list when opening an
> existing file or creating a new one.
>
> Here is a C code snippet:
>
> fapl_id = H5Pcreate (H5P_FILE_ACCESS);
> H5Pset_libver_bounds (fapl_id, H5F_LIBVER_LATEST, H5F_LIBVER_LATEST);
> file_id = H5Fcreate(filename, H5F_ACC_TRUNC, H5P_DEFAULT, fapl_id);
>
> By default, the HDF5 library uses the earliest version of the file format
> when creating groups. The indexing structure used for that version has a
> known deficiency when working with a big number (>50K) of objects in a
> group. The issue was addressed in HDF5 1.8, but requires an application to
> "turn on" the latest file format.
>
> Implications of the latest file format on performance are not well
> documented. The HDF Group is aware of the issue and will be addressing it in
> upcoming releases.
> "
>
> I do suspect that there is another issue here too. You might be triggering
> evictions from the metadata cache due to the number of datasets created, and
> the evictions are causing bad I/O performance. Could you try running your
> program with this HDF5 branch and see if you get any improvement:
> https://svn.hdfgroup.org/hdf5/features/phdf5_metadata_opt/
> (This is a development branch based off HDF5 trunk and requires recent
> versions of autotools to run ./autogen.sh before being able to configure.)
> If you have trouble building it, ping me off-list and we can work things
> out.
>
> Thanks,
> Mohamad
>
>
> -----Original Message-----
> From: Hdf-forum [mailto:hdf-forum-boun...@lists.hdfgroup.org] On Behalf Of James A. Smith
> Sent: Friday, December 04, 2015 4:09 PM
> To: hdf-forum@lists.hdfgroup.org
> Subject: [Hdf-forum] Improving performance of collective object definitions in phdf5
>
> Hi,
>
> I have a serious performance issue using phdf5 to write a lot of 1D float
> array data on clusters when the number of processors exceeds about 96.
>
> I profiled the code and it shows that most of the MPI time is spent in
> H5Dcreate.
>
> The writing (independent) is pretty quick. Are there any ways to speed up
> the performance of the collective object definitions?
>
> Ideally ones that don't involve tailoring settings to a specific cluster.
>
> Here is the function that is slow to finish (and often hangs, due to
> exceeding memory?) on more than ~96 processors:
>
> herr_t ASDF_define_waveforms(hid_t loc_id, int num_waveforms, int nsamples,
>                              long long int start_time, double sampling_rate,
>                              char *event_name, char **waveform_names,
>                              int *data_id) {
>   int i;
>   char char_sampling_rate[10];
>   char char_start_time[10];
>
>   // Convert to decimal strings.
>   snprintf(char_start_time, sizeof(char_start_time), "%lld", start_time);
>   snprintf(char_sampling_rate, sizeof(char_sampling_rate), "%1.7f", sampling_rate);
>
>   for (i = 0; i < num_waveforms; ++i) {
>     //CHK_H5(groups[i] = H5Gcreate(loc_id, waveform_names[i],
>     //                             H5P_DEFAULT, H5P_DEFAULT, H5P_DEFAULT));
>
>     hid_t space_id, dcpl;
>     hsize_t dims[1] = {nsamples};          // Length of waveform
>     hsize_t maxdims[1] = {H5S_UNLIMITED};
>
>     CHK_H5(space_id = H5Screate_simple(1, dims, maxdims));
>     CHK_H5(dcpl = H5Pcreate(H5P_DATASET_CREATE));
>     CHK_H5(H5Pset_chunk(dcpl, 1, dims));
>
>     CHK_H5(data_id[i] = H5Dcreate(loc_id, waveform_names[i], H5T_IEEE_F32LE,
>                                   space_id, H5P_DEFAULT, dcpl, H5P_DEFAULT));
>
>     CHK_H5(ASDF_write_string_attribute(data_id[i], "event_id", event_name));
>     CHK_H5(ASDF_write_double_attribute(data_id[i], "sampling_rate", sampling_rate));
>     CHK_H5(ASDF_write_integer_attribute(data_id[i], "starttime", start_time));
>
>     CHK_H5(H5Pclose(dcpl));
>     CHK_H5(H5Sclose(space_id));
>   }
>   return 0; // Success
> }
>
> It is called from Fortran code inside 3 nested do loops like this:
>
> do k = 1, mysize
>   do j = 1, num_stations_rank(k)
>     do i = 1, 3
>       call ASDF_define_waveforms(...)
>     enddo
>   enddo
> enddo
>
> So when mysize > 96 this is a pretty large number of calls. Any help is
> appreciated.
>
> Thanks,
> James

_______________________________________________
Hdf-forum is for HDF software users discussion.
Hdf-forum@lists.hdfgroup.org
http://lists.hdfgroup.org/mailman/listinfo/hdf-forum_lists.hdfgroup.org
Twitter: https://twitter.com/hdf5