On Tue, Sep 03, 2013 at 07:21:16PM +0000, Biddiscombe, John A. wrote:
> Rob
>
> Thanks very much for this info. I've been reading the manuals and getting up
> to speed with the system. I've set some benchmarks running for parallel IO
> using multiple datasets, compound data types, etc.
>
> When you say ...
>
> > More generally, I've found that some of the default MPI-IO settings are
> > probably not ideal for /Q, and have tested/suggested a change to the
> > "number of I/O aggregators" defaults.
>
> Do you mean aggregators inside ROMIO, or GPFS itself?
I'm speaking about the MPI-IO (ROMIO) library. For Blue Gene, the code hasn't
changed much since /L. Our /Q has 64x more parallelism per node than /L, so one
can imagine the assumptions made in 2004 might need to be updated :> Some of
that is simple tuning of defaults. We're also talking with IBM guys about some
more substantial ROMIO changes.

> I was under the impression that on BGQ machines (which is what I'm
> targeting), the IO was shipped to the IO nodes which performed aggregation
> anyway. This is what I was referring to when I said "shuffling data twice" -
> there's no point in hdf/mpiio performing collective IO as this task was
> being done by the OS. Am I to understand that the IO nodes don't natively
> do a very good job of it and need some assistance?

The I/O nodes on Blue Gene have never been sophisticated. They relay system
calls; that's it. No re-ordering, no coalescing, no caching (OK, GPFS has a
page pool on the I/O node, but that's GPFS doing the caching, not the I/O node
daemon, so I make a distinction).

==rob

--
Rob Latham
Mathematics and Computer Science Division
Argonne National Lab, IL USA

_______________________________________________
Hdf-forum is for HDF software users discussion.
[email protected]
http://mail.lists.hdfgroup.org/mailman/listinfo/hdf-forum_lists.hdfgroup.org
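[A note for readers experimenting with the aggregator tuning Rob mentions: in
ROMIO the aggregator count is controlled by the standard `cb_nodes` hint, and
one way to try different values without touching application code is ROMIO's
hints file, named by the `ROMIO_HINTS` environment variable. A minimal sketch;
the hint names are standard ROMIO, but the values below are purely
illustrative, and whether a given hint takes effect depends on your MPI build
and platform:]

```
cb_nodes 64
cb_buffer_size 16777216
romio_cb_write enable
```

Activate it with something like `export ROMIO_HINTS=/path/to/romio_hints.txt`
before running. The same hints can instead be set per-file in code via
MPI_Info_create / MPI_Info_set and passed to MPI_File_open; HDF5 forwards an
MPI_Info object through H5Pset_fapl_mpio.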
