Hi,

With a stripe count of 1, all access to a single shared file goes through one OSS. Unlike the multiple-file case, you won't use the whole system's bandwidth, so the poor performance is to be expected. From what I gather, to get maximum performance you should write to your file system in chunks of "stripe size", aligned on "stripe size" boundaries, from "stripe count" processes.
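For example, a wider stripe can be requested when the shared file is created, either with lfs setstripe on the output directory before the run (something like "lfs setstripe -c 32 -S 1m <output_dir>", option names vary a bit between Lustre versions), or through MPI-IO hints passed to the HDF5 MPI-IO driver. The sketch below is just that, a sketch: it assumes a ROMIO-based MPI implementation that honors the "striping_factor" / "striping_unit" hints, the values 32 and 1 MiB are illustrative, and the striping only takes effect when the file is newly created.

/* Sketch: requesting a wider Lustre stripe for a shared HDF5 file.
 * Assumes a ROMIO-based MPI-IO layer that honors the striping hints;
 * they are only applied at file creation time, and the exact behaviour
 * depends on the MPI implementation and the Lustre setup. */
#include <mpi.h>
#include <hdf5.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    MPI_Info info;
    MPI_Info_create(&info);
    /* Stripe over (for example) 32 OSTs with a 1 MiB stripe size.
     * These values are illustrative; tune them to your file system. */
    MPI_Info_set(info, "striping_factor", "32");
    MPI_Info_set(info, "striping_unit",   "1048576");

    /* Pass the hints to the HDF5 MPI-IO file driver. */
    hid_t fapl = H5Pcreate(H5P_FILE_ACCESS);
    H5Pset_fapl_mpio(fapl, MPI_COMM_WORLD, info);

    hid_t file = H5Fcreate("shared.h5", H5F_ACC_TRUNC, H5P_DEFAULT, fapl);

    /* ... create datasets and write as usual ... */

    H5Fclose(file);
    H5Pclose(fapl);
    MPI_Info_free(&info);
    MPI_Finalize();
    return 0;
}
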
Cheers,

Matthieu

2013/9/2 Daniel Langr <[email protected]>:
> Hi Albert and Mohamad,
>
> I haven't received e-mails with your replies :( so I cannot reply to them
> specifically (or do not know how). So, replying to my original...
>
> @Albert:
>
> All of the clusters used run the Lustre file system, and I believe this
> file system should be scalable, at least to some extent. Apparently, it is
> scalable for the single-file-per-process strategy.
>
> I understand the note about memory-to-kernel writes. However, again, I am
> comparing the single-file and multiple-file strategies, and they give quite
> different results. Moreover, the multiple-file case corresponds, within my
> measurements, to the listed peak I/O bandwidth of the storage systems. The
> single-file case is much, much worse. Obviously, it is not limited by
> memory-to-kernel copying.
>
> Thanks for the link to h5perf, I will try it.
>
> @Mohamad:
>
> Thanks for the hint. All the file systems are indeed Lustre-based, with the
> default stripe count of 1. I will rerun my measurements with different
> stripe sizes/counts and post the results.
>
> Daniel
>
>
> On 30. 8. 2013 16:05, Daniel Langr wrote:
>
>> I've run a benchmark where, within an MPI program, each process wrote
>> 3 plain 1D arrays to 3 datasets of an HDF5 file. I've used the following
>> writing strategies:
>>
>> 1) each process writes to its own file,
>> 2) each process writes to the same file, to its own dataset,
>> 3) each process writes to the same file, to a shared dataset.
>>
>> I've tested 1)-3) for both fixed and chunked datasets (chunk size 1024),
>> and I've tested 2)-3) with both the independent and collective options of
>> the MPI driver. I've also used 3 different clusters for the measurements
>> (all quite modern).
>>
>> As a result, the running (storage) times of the same-file strategies, i.e.
>> 2) and 3), were orders of magnitude longer than the running times of the
>> separate-files strategy. For illustration:
>>
>> cluster #1, 512 MPI processes, each process stores 100 MB of data, fixed
>> datasets:
>>
>> 1) separate files: 2.73 [s]
>> 2) single file, independent calls, separate datasets: 88.54 [s]
>>
>> cluster #2, 256 MPI processes, each process stores 100 MB of data,
>> chunked datasets (chunk size 1024):
>>
>> 1) separate files: 10.40 [s]
>> 2) single file, independent calls, shared datasets: 295 [s]
>> 3) single file, collective calls, shared datasets: 3275 [s]
>>
>> Any idea why the single-file strategy gives such poor writing performance?
>>
>> Daniel

-- 
Information System Engineer, Ph.D.
Blog: http://matt.eifelle.com
LinkedIn: http://www.linkedin.com/in/matthieubrucher
Music band: http://liliejay.com/
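For reference, a minimal sketch of strategy 3 from the quoted benchmark (every rank writing its contiguous block of one shared 1-D dataset through the HDF5 MPI-IO driver, with collective transfers). The file name, dataset name and per-rank size are illustrative and not taken from Daniel's actual benchmark code.

/* Sketch of "strategy 3": all ranks write one shared 1-D dataset
 * collectively. Names and sizes are illustrative only. */
#include <mpi.h>
#include <hdf5.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    int rank, nprocs;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

    const hsize_t n_local = 1 << 20;               /* elements per rank (example) */
    double *buf = malloc(n_local * sizeof(double));
    for (hsize_t i = 0; i < n_local; i++) buf[i] = (double)rank;

    /* Open one shared file through the MPI-IO driver. */
    hid_t fapl = H5Pcreate(H5P_FILE_ACCESS);
    H5Pset_fapl_mpio(fapl, MPI_COMM_WORLD, MPI_INFO_NULL);
    hid_t file = H5Fcreate("shared.h5", H5F_ACC_TRUNC, H5P_DEFAULT, fapl);

    /* One global dataset holding every rank's block. */
    hsize_t n_global = n_local * (hsize_t)nprocs;
    hid_t filespace = H5Screate_simple(1, &n_global, NULL);
    hid_t dset = H5Dcreate2(file, "x", H5T_NATIVE_DOUBLE, filespace,
                            H5P_DEFAULT, H5P_DEFAULT, H5P_DEFAULT);

    /* Each rank selects its contiguous hyperslab ... */
    hsize_t offset = n_local * (hsize_t)rank;
    H5Sselect_hyperslab(filespace, H5S_SELECT_SET, &offset, NULL, &n_local, NULL);
    hid_t memspace = H5Screate_simple(1, &n_local, NULL);

    /* ... and writes it with a collective transfer. */
    hid_t dxpl = H5Pcreate(H5P_DATASET_XFER);
    H5Pset_dxpl_mpio(dxpl, H5FD_MPIO_COLLECTIVE);
    H5Dwrite(dset, H5T_NATIVE_DOUBLE, memspace, filespace, dxpl, buf);

    H5Pclose(dxpl);
    H5Sclose(memspace);
    H5Sclose(filespace);
    H5Dclose(dset);
    H5Fclose(file);
    H5Pclose(fapl);
    free(buf);
    MPI_Finalize();
    return 0;
}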
