On Jul 09, 2007 at 13:53 -0600, Adam Boggs wrote:
> We've been testing Lustre 1.6.0.1 with MPI-IO (using the mpi-io-test
> benchmark that comes with PVFS2 and NCAR's POP-IO test) on our BlueGene
> and have seen very poor performance...  ~10 MB/s.  IOR shows similar
> results writing to the same file.  Telling IOR to run on different files
> shows great scalability, but our MPI-IO apps don't work that way.  We've
> also seen similar MPI-IO performance at Livermore running with Lustre
> 1.4 on a regular Linux cluster (10-20 MB/s).  We don't see this with the
> same tests running on our GPFS or PVFS2 file systems.
> 
> I haven't tracked this down too far yet, so if anyone has suggestions of
> things to check, or similar/different experiences, I'd love to hear about
> it.  The IO sizes seem reasonably large (~1MB) and there don't appear to
> be any client evictions.  I did try the -o localflock mount option patch
> for 1.6 since I know MPI-IO flocks regions, but haven't had a chance to
> fully benchmark that yet.

Hi Adam, can you post more specifics of your benchmark?  In our testing
at LLNL we didn't see any such problems with IOR, though it is possible
that our IOR parameters were "too well behaved".  The one critical issue
is that for shared-file tests you need to set the striping on the file
(or its parent directory) to be across all OSTs.  We are working to
integrate this properly with MPI-IO so that shared output files are
created with the wide striping automatically.
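In case it helps, below is a rough sketch of how the wide striping can be
requested at file-creation time through the standard MPI-IO
"striping_factor"/"striping_unit" hints.  The path, the hint values, and
whether your ADIO driver actually passes the hints through to Lustre are
assumptions on my part, not something from our testing; running
"lfs setstripe" on the parent directory ahead of time has the same effect.

#include <mpi.h>

int main(int argc, char **argv)
{
    MPI_File fh;
    MPI_Info info;
    static char buf[1048576];          /* 1MB of (uninitialized) test data */
    int rank;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* Reserved MPI-IO hints; whether they reach Lustre depends on the
     * ADIO driver in your MPI stack.  Setting the stripe count on the
     * parent directory with "lfs setstripe" is the shell equivalent. */
    MPI_Info_create(&info);
    MPI_Info_set(info, "striping_factor", "32");    /* stripe over 32 OSTs */
    MPI_Info_set(info, "striping_unit", "1048576"); /* 1MB stripe size */

    /* hypothetical path under a Lustre mount */
    MPI_File_open(MPI_COMM_WORLD, "/mnt/lustre/shared_output",
                  MPI_MODE_CREATE | MPI_MODE_WRONLY, info, &fh);

    /* each rank writes one 1MB block at a disjoint offset in the shared file */
    MPI_File_write_at(fh, (MPI_Offset)rank * sizeof(buf), buf,
                      sizeof(buf), MPI_BYTE, MPI_STATUS_IGNORE);

    MPI_File_close(&fh);
    MPI_Info_free(&info);
    MPI_Finalize();
    return 0;
}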

For example, some results from overnight testing (32 OSTs, Elan3 + GigE):

POSIX, file-per-process:
----------------------------------------------------------------------
 tasks  stripe    xfer bytes/              rates (MB/s)          sample
 (CPUs) ct  size  size task   write(%dev)(%opt) read(%dev)(%opt) count
----------------------------------------------------------------------
   2(2)  2    1M    2M    2G     146( 3)(20)      248(30)(34)      4
  32(2)  2    1M    2M    2G    1656( 1)(14)     1934( 1)(17)      4
 256(2)  2    1M    2M    2G    3075( 1)(25)     2104( 3)(17)      4
 770(2)  2    1M    2M    2G    2991( 0)(24)     2031( 0)(16)      1

POSIX, single-shared-file:
----------------------------------------------------------------------
 tasks  stripe    xfer bytes/              rates (MB/s)          sample
 (CPUs) ct  size  size task   write(%dev)(%opt) read(%dev)(%opt) count
----------------------------------------------------------------------
   2(2) 32    1M    2M    2G      93( 0)( 2)      148( 2)( 3)      4
  32(2) 32    1M    2M    2G    1260( 1)(22)     1951( 4)(34)      3
 256(2) 32    1M    2M    2G    3024( 1)(53)     2035( 0)(35)      3
 792(2) 32    1M    2M    2G    3026( 0)(53)     1925( 0)(33)      1

MPIIO, file-per-process:
----------------------------------------------------------------------
 tasks  stripe    xfer bytes/              rates (MB/s)          sample
 (CPUs) ct  size  size task   write(%dev)(%opt) read(%dev)(%opt) count
----------------------------------------------------------------------
   2(2)  2    1M    2M    2G     147( 1)(20)      227(27)(32)      4
  32(2)  2    1M    2M    2G    1616( 2)(14)     1923( 2)(17)      4
 256(2)  2    1M    2M    2G    2695( 6)(22)     1982( 1)(16)      4
 798(2)  2    1M    2M    2G    2881( 0)(23)     2078( 0)(17)      1

MPIIO, single-shared-file:
----------------------------------------------------------------------
 tasks  stripe    xfer bytes/              rates (MB/s)          sample
 (CPUs) ct  size  size task   write(%dev)(%opt) read(%dev)(%opt) count
----------------------------------------------------------------------
   2(2) 32    1M    2M    2G      82( 1)( 1)      201(29)( 3)      4
  32(2) 32    1M    2M    2G    1331( 1)(23)     1990( 3)(35)      4
 256(2) 32    1M    2M    2G    2980( 1)(52)     2006( 2)(35)      4
 792(2) 32    1M    2M    2G    3053( 0)(53)     2016( 0)(35)      1


Cheers, Andreas
--
Andreas Dilger
Principal Software Engineer
Cluster File Systems, Inc.
