Re: [Hdf-forum] Poor write performance with 30,000 MPI ranks (pHDF5)

Mark Howison Tue, 08 Mar 2011 06:32:35 -0800

Hi Leigh,

I've never actually tried chunking in conjunction with collective
buffering, and it may be interacting poorly with the CB algorithm in
the Cray library. I would also advise trying contiguous storage, as
Quincey suggested, or switching to independent mode and using
chunking with alignment to pad the chunks out to the stripe width.
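
As a rough illustration of what "padding the chunks out to the stripe
width" means in bytes, the sketch below rounds a chunk size up to the
next multiple of a stripe size. The helper names (pad_to_stripe,
padding_bytes) are hypothetical and are not part of HDF5. In HDF5
itself, alignment is requested on the file access property list with
H5Pset_alignment(fapl, threshold, alignment), and independent mode is
selected with H5Pset_dxpl_mpio(dxpl, H5FD_MPIO_INDEPENDENT).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not an HDF5 call): round nbytes up to the
 * next multiple of stripe_bytes. This is the on-disk footprint of
 * one chunk if every chunk has to start on a stripe boundary. */
uint64_t pad_to_stripe(uint64_t nbytes, uint64_t stripe_bytes)
{
    assert(stripe_bytes > 0);
    uint64_t rem = nbytes % stripe_bytes;
    return rem == 0 ? nbytes : nbytes + (stripe_bytes - rem);
}

/* Hypothetical helper: bytes of padding added per chunk. */
uint64_t padding_bytes(uint64_t nbytes, uint64_t stripe_bytes)
{
    return pad_to_stripe(nbytes, stripe_bytes) - nbytes;
}
```

With assumed values (neither figure comes from this thread) of a
4,000,000-byte chunk, e.g. 100 x 100 x 100 4-byte floats, and a 1 MiB
(1,048,576-byte) stripe, pad_to_stripe returns 4,194,304 bytes, so
each chunk carries 194,304 bytes of padding.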


Mark

On Mon, Mar 7, 2011 at 10:41 PM, Quincey Koziol <[email protected]> wrote:
> Hi Leigh,
>
> On Mar 5, 2011, at 12:49 PM, Leigh Orf wrote:
>
>> Did another test on kraken, ended up with approximately 270 MB/s
>> performance. This appears to be in line with the "baseline" results of
>> the "Tuning HDF5 for Lustre File Systems" paper. I double checked to
>> verify that I was using version 1.8.5 and have H5FD_MPIO_COLLECTIVE
>> set.
>>
>> For this particular test, I wrote 12 files spanning 30,000 cores
>> simultaneously. I watched the data come in (ls -l every 5 seconds) and
>> noticed that it arrived in fits and starts. Towards the end of the
>> writes, only a few hundred bytes remained to be written, and those
>> bytes took a long time to get written. Something is weird.
>>
>> I put some stuff online if anyone wants to take a glance at it. The
>> output of h5stat and h5ls on one of the files is included, the output
>> of lfs getstripe is included, and a typescript file showing the ls -l
>> output every 5 seconds is included to show how the files grew over
>> time.
>>
>> You can view the files here: http://orf5.com/hdf5/kraken
>
>        Interesting...  So, your datasets are all fixed size, with no
> filters. If you are writing the entire dataset in one I/O operation (via
> collective parallel I/O, or with serial I/O), you should try switching to
> contiguous storage for all your datasets.
>
>        Quincey
>
>> I have one question that may be at the root of this performance issue.
>> The Tuning paper talked about how chunks should be aligned. I have
>> chosen my own chunk dimensions, which are the same size as the array
>> dimensions. h5ls -rv shows that those chunk dimensions are preserved
>> (and these chunk dimensions are of course not aligned). Does this mean
>> I am overriding an internal mechanism in HDF5 which chooses its own
>> chunk dimensions based upon the Lustre stripe size? If I do not write
>> chunked data, will pHDF5 choose chunk dimensions for me which are
>> aligned?
>>
>> Thanks,
>>
>> Leigh
>>
>>
>> --
>> Leigh Orf
>> Associate Professor of Atmospheric Science
>> Department of Geology and Meteorology
>> Central Michigan University
>> Currently on sabbatical at the National Center for Atmospheric
>> Research in Boulder, CO
>> NCAR office phone: (303) 497-8200
>>
>> _______________________________________________
>> Hdf-forum is for HDF software users discussion.
>> [email protected]
>> http://mail.hdfgroup.org/mailman/listinfo/hdf-forum_hdfgroup.org
>
>

