Hi, Chris

Sorry for the delayed response. After much effort, the Open MPI 1.7 branch now 
supports PMI2 (in general, not just for ALPS) and has been tested and evaluated 
at small-ish scale (up to 512 ranks) with SLURM 2.6. We need to test this at 
larger scale and plan to do so in the coming weeks, but what we have observed 
thus far is the following:

1. The KVS Fence operation appears to scale worse than linearly. This issue 
resides solely on the SLURM side. Perhaps a better collective algorithm could 
be implemented - we have discussed recursive doubling and Bruck's algorithm as 
alternatives (a rough sketch of the recursive-doubling exchange appears after 
item 3 below).

2. There are still O(N) calls to PMI2_KVS_Get at the OMPI/ORTE level that 
don't appear to scale particularly well. Circumventing this remains an open 
challenge, though proposals have been tossed around, such as having a single 
node leader fetch all of the data from the KVS space and place it in a 
shared-memory segment from which the other ranks on the host can read (also 
sketched below). Unfortunately, this is still O(N), just with a reduced 
coefficient.

3. We observed that launch times are longer with SLURM 2.6 than they were with 
the 2.5.x series. However, anecdotally, scaling appears to be improved. From 
our (Mellanox's) point of view, getting something that doesn't "blow up" 
quadratically as N grows to 4K ranks and beyond is more important than the 
absolute performance of launching any one job size.
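
For concreteness, here is a minimal sketch of the recursive-doubling exchange 
mentioned in item 1. This is illustrative C only, not SLURM code: exchange() 
stands in for whatever transport the PMI server would use between its daemons, 
the blob size cap is assumed, and bounds/error checking are omitted. The point 
is simply that each process finishes in log2(P) rounds rather than the O(P) 
steps of a linear gather/broadcast.

#include <string.h>

#define BLOB_MAX (1 << 20)              /* assumed cap on per-job KVS data */

typedef struct {
    char   data[BLOB_MAX];
    size_t len;
} kvs_blob_t;

/* Placeholder: swap blobs with `peer` (send *out, receive into *in). */
extern void exchange(int peer, const kvs_blob_t *out, kvs_blob_t *in);

/* Assumes nprocs is a power of two; other counts need the usual extra step. */
void recursive_doubling_fence(int rank, int nprocs, kvs_blob_t *mine)
{
    kvs_blob_t in;

    /* Round k pairs ranks that differ in bit k. After each round both
     * partners hold the union of their data, so log2(nprocs) rounds give
     * every rank the full KVS contents. */
    for (int mask = 1; mask < nprocs; mask <<= 1) {
        exchange(rank ^ mask, mine, &in);
        memcpy(mine->data + mine->len, in.data, in.len);
        mine->len += in.len;
    }
}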
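
And here is a rough sketch of the node-leader idea from item 2, assuming the 
standard pmi2.h interface (PMI2_KVS_Get, PMI2_MAX_VALLEN, etc.) plus POSIX 
shared memory. The segment name and the "btl.addr.%d" key are made up for 
illustration, and error handling is stripped; as noted above, this only 
shrinks the per-node constant, the loop is still O(N).

#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>
#include <pmi2.h>                       /* PMI2_KVS_Get, PMI2_MAX_*, ... */

#define SEG_NAME "/ompi_kvs_cache"      /* hypothetical segment name */

/* Run by one designated rank per host; the other local ranks shm_open()
 * the same segment read-only and mmap() it instead of calling into PMI2. */
char *leader_publish_kvs(const char *jobid, int nprocs)
{
    size_t seg_size = (size_t)nprocs * PMI2_MAX_VALLEN;
    int    fd = shm_open(SEG_NAME, O_CREAT | O_RDWR, 0600);
    char  *seg;

    ftruncate(fd, seg_size);
    seg = mmap(NULL, seg_size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);

    /* Still O(N) gets, but once per node rather than once per rank. */
    for (int r = 0; r < nprocs; r++) {
        char key[PMI2_MAX_KEYLEN];
        int  vallen = 0;

        snprintf(key, sizeof(key), "btl.addr.%d", r);   /* assumed key name */
        PMI2_KVS_Get(jobid, PMI2_ID_NULL, key,
                     seg + (size_t)r * PMI2_MAX_VALLEN,
                     PMI2_MAX_VALLEN, &vallen);
    }
    close(fd);
    return seg;
}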

From the data that I have seen, it appears that simply switching to SLURM 2.6 
(along with the latest OMPI 1.7) will most likely not provide performance 
comparable to launching with mpirun. I'll be sure to keep you and the 
community apprised of the situation as more data on larger systems becomes 
available in the coming weeks.


Best regards,

Josh


Joshua S. Ladd, PhD
HPC Algorithms Engineer
Mellanox Technologies 

Email: josh...@mellanox.com
Cell: +1 (865) 258 - 8898


     

-----Original Message-----
From: devel [mailto:devel-boun...@open-mpi.org] On Behalf Of Christopher Samuel
Sent: Thursday, August 08, 2013 12:26 AM
To: de...@open-mpi.org
Subject: Re: [OMPI devel] Open-MPI build of NAMD launched from srun over 20% 
slowed than with mpirun


Hi Joshua,

On 23/07/13 19:34, Joshua Ladd wrote:

> The proposed solution that "we" (OMPI + SLURM) have come up with is to 
> modify OMPI to support PMI2 and to use SLURM 2.6 which has support for 
> PMI2 and is (allegedly) much more scalable than PMI1.
> Several folks in the combined communities are working hard, as we 
> speak, trying to get this functional to see if it indeed makes a 
> difference. Stay tuned, Chris. Hopefully we will have some data by the 
> end of the week.

Is there any news on this?

We'd love to be able to test this out if we can, as I currently see a 60% 
penalty with srun on my test NAMD job from our tame MM person.

thanks!
Chris
-- 
 Christopher Samuel        Senior Systems Administrator
 VLSCI - Victorian Life Sciences Computation Initiative
 Email: sam...@unimelb.edu.au Phone: +61 (0)3 903 55545
 http://www.vlsci.org.au/      http://twitter.com/vlsci
