On Jul 23, 2013, at 9:59 AM, Tim Wickberg <wickb...@gwu.edu> wrote:

> I'm assuming the jobs are running across multiple nodes, using MPI for 
> communication?
> 
> I'm guessing that srun is resulting in the communication going across a GigE 
> fabric rather than IB, whereas mpirun is using the IB directly. A ~20% 
> performance penalty would make sense in that context.

This would happen only if someone specified that mpirun use the IP-over-IB 
interface, which users typically don't do, in order to maintain separation 
between the out-of-band and MPI traffic.
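
For concreteness, the interface selection can be pinned down explicitly with
MCA parameters; a rough sketch only (the interface names eth0/ib0 and the
NAMD command line are placeholders for whatever the site actually uses):

  # keep mpirun's out-of-band (launch/admin) traffic on the GigE side and
  # restrict the MPI traffic itself to the IB and shared-memory BTLs
  mpirun --mca oob_tcp_if_include eth0 --mca btl openib,sm,self namd2 bench.namd

  # the same BTL restriction can be applied to the srun case through the
  # environment, which is a quick way to rule out a silent fallback to TCP:
  export OMPI_MCA_btl=openib,sm,self
  srun namd2 bench.namd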


> 
> - Tim
> 
> --
> Tim Wickberg
> wickb...@gwu.edu
> Senior HPC Systems Administrator
> The George Washington University
> 
> 
> On Tue, Jul 23, 2013 at 3:06 AM, Christopher Samuel <sam...@unimelb.edu.au> 
> wrote:
> 
> Hi there slurm-dev and OMPI devel lists,
> 
> Bringing up a new IBM SandyBridge cluster, I'm running a NAMD test case
> and noticed that it goes over 20% slower when run with srun rather than
> with mpirun.  Both cases are launched from an sbatch script.
> 
> Slurm 2.6.0, RHEL 6.4 (latest kernel), FDR IB.
> 
> Here are some timings, as reported by NAMD itself as its WallClock time
> (so not including any startup/teardown overhead from Slurm).
> 
> srun:
> 
> run1/slurm-93744.out:WallClock: 695.079773  CPUTime: 695.079773
> run4/slurm-94011.out:WallClock: 723.907959  CPUTime: 723.907959
> run5/slurm-94013.out:WallClock: 726.156799  CPUTime: 726.156799
> run6/slurm-94017.out:WallClock: 724.828918  CPUTime: 724.828918
> 
> Average of 717 seconds.
> 
> mpirun:
> 
> run2/slurm-93746.out:WallClock: 559.311035  CPUTime: 559.311035
> run3/slurm-93910.out:WallClock: 544.116333  CPUTime: 544.116333
> run7/slurm-94019.out:WallClock: 586.072693  CPUTime: 586.072693
> 
> Average of 563 seconds.
> 
> So the srun runs are about 27% slower (717 seconds vs. 563 seconds).
> 
> Everything is identical (they're all symlinks to the same golden
> master) *except* for the launcher: the batch script is copied and
> mpirun is substituted for srun.
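
For anyone wanting to reproduce the comparison, it boils down to two
near-identical sbatch scripts; a minimal sketch (the node counts and the
NAMD binary/input names here are placeholders, not the actual script):

  #!/bin/bash
  #SBATCH --nodes=4
  #SBATCH --ntasks-per-node=16

  # variant A: launch the ranks through Slurm's own launcher
  srun namd2 bench.namd

  # variant B: identical script, but with the launch line replaced by
  # mpirun namd2 bench.namd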
> 
> When they are running I can see that jobs launched with srun are direct
> children of slurmstepd, whereas jobs started with mpirun are children of
> Open MPI's orted (or of mpirun itself on the launch node), which in turn
> is a child of slurmstepd.
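
A quick way to see that difference on a compute node while the job is
running is something like this (assumes a single job per node; otherwise
pick the right slurmstepd PID):

  # show the process tree hanging off the oldest slurmstepd on the node
  pstree -ap $(pgrep -o slurmstepd)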
> 
> Has anyone else seen anything like this, or got any ideas?
> 
> cheers,
> Chris
> --
>  Christopher Samuel        Senior Systems Administrator
>  VLSCI - Victorian Life Sciences Computation Initiative
>  Email: sam...@unimelb.edu.au Phone: +61 (0)3 903 55545
>  http://www.vlsci.org.au/      http://twitter.com/vlsci
> 
> 
