On Fri, Sep 09, 2011 at 04:31:39PM -0500, Yongsheng Pan wrote:
> Hello, dear hdf experts,

Howdy, fellow ANL-er

> I am using the parallel hdf5 library to access a three-dimensional
> dataset in a hdf5 file on linux. The dataset is of size: dim[0]=1501,
> dim[1]=1536 and dim[2]=2048. I want to read the dataset along the first
> (READ_Z), second (READ_Y), and third (READ_X) dimension separately.
>
> However, the performance is quite different.

Sure. In one order, the bytes are all laid out nice and contiguous, and a reader can just zip through. In another order, the access is non-contiguous, requiring the reader to collect many pieces and parts from across the dataset.

> I am using mpich2 and Parallel hdf5 1.8.5 on linux. The compiler is
> mpicc from mpich2.

Are you perhaps using Argonne's Fusion cluster? I don't know anything about the APS compute resources, so I can't offer any tuning suggestions there.

I don't think you should turn off collectives. I do think you should consider whether your application, in the aggregate, could read the entire dataset in a single call. What I mean is: a given process can be assigned any decomposition you like, but if you think of each process's decomposition as a puzzle piece, all the puzzle pieces should fit together to form the full 3D array. Then your MPI-IO library can work some magic.

==rob

--
Rob Latham
Mathematics and Computer Science Division
Argonne National Lab, IL USA

_______________________________________________
Hdf-forum is for HDF software users discussion.
[email protected]
http://mail.hdfgroup.org/mailman/listinfo/hdf-forum_hdfgroup.org
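To make the contiguous-vs-strided point concrete, here is a small sketch (plain Python, no HDF5 calls; the 4-byte element size is an assumption) computing byte offsets for the poster's 1501 x 1536 x 2048 dataset under a C-order (row-major) layout, which is how HDF5 stores a contiguous dataset:

```python
# Assumptions: C-order (row-major) layout, 4-byte elements.
dims = (1501, 1536, 2048)
ITEMSIZE = 4

def offset(i, j, k):
    """Byte offset of element [i][j][k] in a row-major layout."""
    d0, d1, d2 = dims
    return ((i * d1 + j) * d2 + k) * ITEMSIZE

# READ_X: consecutive elements along the last dimension sit next to
# each other on disk -- one contiguous sweep.
stride_x = offset(0, 0, 1) - offset(0, 0, 0)

# READ_Z: consecutive elements along the first dimension are a whole
# 1536 x 2048 plane apart -- many scattered, tiny accesses.
stride_z = offset(1, 0, 0) - offset(0, 0, 0)

print(stride_x)  # 4
print(stride_z)  # 12582912
```

A jump of roughly 12 MB between consecutive elements is why the READ_Z pattern performs so much worse than READ_X unless the I/O layer can aggregate the requests.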
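The "puzzle piece" decomposition above can be sketched as follows. This is a hypothetical illustration (plain Python, names like `slab` are my own): each rank computes a (start, count) hyperslab along dimension 0, and the slabs tile the full array with no gaps or overlap. In the real application, each rank would feed its (start, count) to `H5Sselect_hyperslab` and then issue one `H5Dread` with a collective (`H5FD_MPIO_COLLECTIVE`) transfer property, so that in aggregate the processes read the whole dataset in a single call:

```python
# Hypothetical decomposition: split dimension 0 of the 1501-plane
# dataset across nprocs ranks as evenly as possible.
dims = (1501, 1536, 2048)

def slab(rank, nprocs):
    """Return (start, count) of this rank's hyperslab along dim 0."""
    base, extra = divmod(dims[0], nprocs)
    start0 = rank * base + min(rank, extra)   # earlier ranks absorb the remainder
    count0 = base + (1 if rank < extra else 0)
    return (start0, 0, 0), (count0, dims[1], dims[2])

# The pieces fit together: every plane is covered exactly once.
nprocs = 8
covered = sum(slab(r, nprocs)[1][0] for r in range(nprocs))
print(covered)  # 1501
```

With a decomposition like this, the MPI-IO layer sees one large, collectively described request instead of many small independent ones, which is where the "magic" (two-phase collective buffering) comes from.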
