On Jul 23, 2013, at 9:59 AM, Tim Wickberg <wickb...@gwu.edu> wrote:

> I'm assuming the jobs are running across multiple nodes, using MPI for
> communication?
>
> I'm guessing that srun is resulting in communication going across a GigE
> fabric rather than IB, where mpirun directly is using the IB. A ~20%
> performance penalty would make sense in that context.
This would happen only if someone specified that mpirun use the ip-over-ib interface, which users typically don't do, precisely to keep the out-of-band and MPI traffic separate. (The MCA parameters that control this are sketched below the quoted message.)

> - Tim
>
> --
> Tim Wickberg
> wickb...@gwu.edu
> Senior HPC Systems Administrator
> The George Washington University
>
>
> On Tue, Jul 23, 2013 at 3:06 AM, Christopher Samuel <sam...@unimelb.edu.au>
> wrote:
>
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> Hi there slurm-dev and OMPI devel lists,
>
> Bringing up a new IBM SandyBridge cluster I'm running a NAMD test case
> and noticed that if I run it with srun rather than mpirun it goes over
> 20% slower. These are all launched from an sbatch script too.
>
> Slurm 2.6.0, RHEL 6.4 (latest kernel), FDR IB.
>
> Here are some timings as reported as the WallClock time by NAMD itself
> (so not including startup/tear down overhead from Slurm).
>
> srun:
>
> run1/slurm-93744.out:WallClock: 695.079773  CPUTime: 695.079773
> run4/slurm-94011.out:WallClock: 723.907959  CPUTime: 723.907959
> run5/slurm-94013.out:WallClock: 726.156799  CPUTime: 726.156799
> run6/slurm-94017.out:WallClock: 724.828918  CPUTime: 724.828918
>
> Average of 692 seconds
>
> mpirun:
>
> run2/slurm-93746.out:WallClock: 559.311035  CPUTime: 559.311035
> run3/slurm-93910.out:WallClock: 544.116333  CPUTime: 544.116333
> run7/slurm-94019.out:WallClock: 586.072693  CPUTime: 586.072693
>
> Average of 563 seconds.
>
> So that's about 23% slower.
>
> Everything is identical (they're all symlinks to the same golden
> master) *except* for the srun / mpirun which is modified by copying
> the batch script and substituting mpirun for srun.
>
> When they are running I can see that for jobs launched with srun they
> are direct children of slurmstepd whereas when started with mpirun
> they are children of Open-MPI's orted (or mpirun on the launch node)
> which itself is a child of slurmstepd.
>
> Has anyone else seen anything like this, or got any ideas?
>
> cheers,
> Chris
> - --
> Christopher Samuel        Senior Systems Administrator
> VLSCI - Victorian Life Sciences Computation Initiative
> Email: sam...@unimelb.edu.au     Phone: +61 (0)3 903 55545
> http://www.vlsci.org.au/      http://twitter.com/vlsci
>
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG v1.4.11 (GNU/Linux)
> Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/
>
> iEYEARECAAYFAlHuKxoACgkQO2KABBYQAh8cYQCfT/YIFkyeDaNb/ksT2xk4W416
> kycAoJfdZInLwy+nTIL7CzWapZZU20qm
> =ZJ1B
> -----END PGP SIGNATURE-----
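For anyone who wants to check this on their own system, here is a rough sketch of the Open MPI MCA parameters involved. The interface names (eth0, ib0) and the NAMD binary name are just placeholders for whatever the site actually uses:

  # Keep the out-of-band (launch/wireup) traffic on the GigE interface and
  # restrict MPI point-to-point traffic to the verbs BTL, so nothing rides
  # over IPoIB:
  mpirun --mca oob_tcp_if_include eth0 \
         --mca btl openib,sm,self \
         ./namd2 <args>

  # If the TCP BTL has to stay enabled, at least exclude the IPoIB interface:
  mpirun --mca btl_tcp_if_exclude ib0,lo ./namd2 <args>

  # ompi_info shows what the current defaults are for these parameters:
  ompi_info --param oob tcp | grep if_
  ompi_info --param btl tcp | grep if_

By default the openib BTL wins the selection on verbs hardware and the OOB stays on TCP, so the separation normally comes for free; the settings above only matter if someone has overridden them.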