On Jul 09, 2007 13:53 -0600, Adam Boggs wrote:
> We've been testing lustre 1.6.0.1 with MPI-IO (using the mpi-io-test
> benchmark that comes with pvfs2 and NCAR's POP-IO test) on our BlueGene
> and have seen very poor performance... ~10 MB/s.  IOR shows similar
> results writing to the same file.  Telling IOR to run on different files
> shows great scalability, but our MPI-IO apps don't work that way.  We've
> also seen similar MPI-IO performance at livermore running with lustre
> 1.4 on a regular linux cluster (10-20 MB/s).  We don't see this with the
> same tests running on our gpfs or pvfs2 file systems.
>
> I haven't tracked this down too far yet, so if anyone has suggestions of
> things to check, or similar/different experiences, I'd love to hear about
> it.  The IO sizes seem reasonably large (~1MB) and there don't appear to
> be any client evictions.  I did try the -o localflock mount option patch
> for 1.6 since I know MPI-IO flocks regions, but haven't had a chance to
> fully benchmark that yet.
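[For context, the "writing to the same file" case described above is the single-shared-file pattern that IOR and mpi-io-test exercise. A minimal illustrative sketch of that pattern is below; it is not the actual benchmark source, and the file path and transfer size are made-up values.]

    /* Minimal shared-file MPI-IO write sketch (illustrative only, not the
     * mpi-io-test or IOR source).  Every rank writes XFER_SIZE bytes to a
     * rank-specific offset of one shared file with a collective write. */
    #include <mpi.h>
    #include <stdlib.h>
    #include <string.h>

    #define XFER_SIZE (1 << 20)   /* 1 MB per rank per call (assumed value) */

    int main(int argc, char **argv)
    {
        MPI_File fh;
        int rank;
        char *buf;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        buf = malloc(XFER_SIZE);
        memset(buf, rank & 0xff, XFER_SIZE);

        /* All ranks open the same file -- the "single shared file" case.
         * The path is hypothetical. */
        MPI_File_open(MPI_COMM_WORLD, "/mnt/lustre/shared_file",
                      MPI_MODE_CREATE | MPI_MODE_WRONLY, MPI_INFO_NULL, &fh);

        /* Each rank writes its own non-overlapping region collectively. */
        MPI_File_write_at_all(fh, (MPI_Offset)rank * XFER_SIZE, buf,
                              XFER_SIZE, MPI_BYTE, MPI_STATUS_IGNORE);

        MPI_File_close(&fh);
        free(buf);
        MPI_Finalize();
        return 0;
    }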
Hi Adam,
can you post more specifics of your benchmark?  In our testing at LLNL
we didn't see any such problems with IOR, though it is possible that the
IOR parameters are "too well behaved" or something.

The one critical issue is that for shared-file tests you need to set the
striping on the file (or parent directory) to be across all OSTs.  We are
working to integrate this properly with MPIIO so that it creates the
output files with the wide striping for shared-file output automatically.
(See the sketch after this message for one way to request this from the
application side.)

For example, some results from overnight testing (32 OSTs, Elan3 + GigE):

POSIX, file-per-process:
------------------------------------------------------------------------------
 tasks    stripe     xfer  bytes/          rates (MB/s)             sample
(CPUs)   ct   size   size   task   write(%dev)(%opt) read(%dev)(%opt)  count
------------------------------------------------------------------------------
  2(2)    2    1M     2M     2G      146( 3)(20)       248(30)(34)       4
 32(2)    2    1M     2M     2G     1656( 1)(14)      1934( 1)(17)       4
256(2)    2    1M     2M     2G     3075( 1)(25)      2104( 3)(17)       4
770(2)    2    1M     2M     2G     2991( 0)(24)      2031( 0)(16)       1

POSIX, single-shared-file:
------------------------------------------------------------------------------
 tasks    stripe     xfer  bytes/          rates (MB/s)             sample
(CPUs)   ct   size   size   task   write(%dev)(%opt) read(%dev)(%opt)  count
------------------------------------------------------------------------------
  2(2)   32    1M     2M     2G       93( 0)( 2)       148( 2)( 3)       4
 32(2)   32    1M     2M     2G     1260( 1)(22)      1951( 4)(34)       3
256(2)   32    1M     2M     2G     3024( 1)(53)      2035( 0)(35)       3
792(2)   32    1M     2M     2G     3026( 0)(53)      1925( 0)(33)       1

MPIIO, file-per-process:
------------------------------------------------------------------------------
 tasks    stripe     xfer  bytes/          rates (MB/s)             sample
(CPUs)   ct   size   size   task   write(%dev)(%opt) read(%dev)(%opt)  count
------------------------------------------------------------------------------
  2(2)    2    1M     2M     2G      147( 1)(20)       227(27)(32)       4
 32(2)    2    1M     2M     2G     1616( 2)(14)      1923( 2)(17)       4
256(2)    2    1M     2M     2G     2695( 6)(22)      1982( 1)(16)       4
798(2)    2    1M     2M     2G     2881( 0)(23)      2078( 0)(17)       1

MPIIO, single-shared-file:
------------------------------------------------------------------------------
 tasks    stripe     xfer  bytes/          rates (MB/s)             sample
(CPUs)   ct   size   size   task   write(%dev)(%opt) read(%dev)(%opt)  count
------------------------------------------------------------------------------
  2(2)   32    1M     2M     2G       82( 1)( 1)       201(29)( 3)       4
 32(2)   32    1M     2M     2G     1331( 1)(23)      1990( 3)(35)       4
256(2)   32    1M     2M     2G     2980( 1)(52)      2006( 2)(35)       4
792(2)   32    1M     2M     2G     3053( 0)(53)      2016( 0)(35)       1

Cheers, Andreas
--
Andreas Dilger
Principal Software Engineer
Cluster File Systems, Inc.
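[To make the striping advice above concrete, here is a minimal sketch of requesting wide striping at file-creation time through the MPI-IO info object. It assumes an MPI-IO implementation (e.g. ROMIO) that forwards the conventional "striping_factor" and "striping_unit" hints to the file system; whether a given Lustre ADIO driver honors them needs to be verified against your MPI stack, and the fallback is to set the default striping on the output directory with lfs setstripe before the run. The hint values are made-up examples.]

    /* Sketch: ask for wide striping when creating the shared output file.
     * Assumes the MPI-IO layer forwards the "striping_factor" and
     * "striping_unit" hints to the file system; if it does not, set the
     * striping on the file or parent directory with lfs setstripe instead. */
    #include <mpi.h>

    int open_wide_striped(MPI_Comm comm, const char *path, MPI_File *fh)
    {
        MPI_Info info;
        int rc;

        MPI_Info_create(&info);
        /* Example values: stripe over 32 OSTs with a 1 MB stripe size,
         * matching the shared-file configuration in the results above. */
        MPI_Info_set(info, "striping_factor", "32");
        MPI_Info_set(info, "striping_unit", "1048576");

        rc = MPI_File_open(comm, path, MPI_MODE_CREATE | MPI_MODE_WRONLY,
                           info, fh);

        MPI_Info_free(&info);
        return rc;
    }

[If the hints are ignored, pre-creating the file or setting the default striping on its parent directory achieves the same thing; note that the single-shared-file results above use a stripe count of 32 across the 32 OSTs.]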
