On Tue, Sep 03, 2013 at 07:21:16PM +0000, Biddiscombe, John A. wrote:
> Rob
> 
> Thanks very much for this info. I've been reading the manuals and getting up 
> to speed with the system. I've set some benchmarks running for parallel IO 
> using multiple datasets, compound data types etc etc. 
> 
> when you say ...
> 
> > More generally, I've found that some of the default MPI-IO settings are
> > probably not ideal for /Q, and have tested/suggested a change to the
> > "number of I/O aggregators" defaults.
> 
> Do you mean aggregators inside romio, or gpfs itself. 

I'm speaking about the MPI-IO (ROMIO) library.  For Blue Gene, the code
hasn't changed much since /L.  Our /Q has 64x more parallelism per
node than /L, so one can imagine the assumptions made in 2004 might
need to be updated :>

Some of that is simple tuning of defaults.  We're also talking with
IBM guys about some more substantial ROMIO changes.   

> I was under the impression that on BGQ machines (which is what I'm 
> targeting), the IO was shipped to the IO nodes which performed aggregation 
> anyway. This  is what I was referring to when I said "shuffling data twice" - 
> there's no point in hdf/mpiio performing collective IO as this task was being 
> done by the OS. Am I to understand that the IO nodes don't natively do a very 
> good job of it and need some assistance?

The I/O nodes on Blue Gene have never been sophisticated.  They relay
system calls.  The end.  No re-ordering, no coalescing, no caching
(OK, GPFS has a page pool on the I/O node, but that's GPFS doing the
caching, not the I/O node daemon, so I make a distinction).

==rob


-- 
Rob Latham
Mathematics and Computer Science Division
Argonne National Lab, IL USA

_______________________________________________
Hdf-forum is for HDF software users discussion.
[email protected]
http://mail.lists.hdfgroup.org/mailman/listinfo/hdf-forum_lists.hdfgroup.org