-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 08/05/14 12:54, Ralph Castain wrote:

> I think there was one 2.6.x that was borked, and definitely
> problems in the 14.03.x line. Can't pinpoint it for you, though.

No worries, thanks.

> Sounds good. I'm going to have to dig deeper into those numbers,
> though, as they don't entirely add up to me. Once the job gets
> launched, the launch method itself should have no bearing on
> computational speed - IF all things are equal. In other words, if
> the process layout is the same, and the binding pattern is the
> same, then computational speed should be roughly equivalent
> regardless of how the procs were started.

Not sure if it's significant but when mpirun was launching processes
it was using srun to start orted which then started MPI ranks whereas
with PMI/PMI2 it appeared to directly start the ranks.

> My guess is that your data might indicate a difference in the
> layout and/or binding pattern as opposed to PMI2 vs mpirun. At the
> scale you mention later in the thread (only 70 nodes x 16 ppn), the
> difference in launch timing would be zilch. So I'm betting you
> would find (upon further exploration) that (a) you might not have
> been binding processes when launching by mpirun, since we didn't
> bind by default until the 1.8 series, but were binding under direct
> srun launch, and (b) your process mapping would quite likely be
> different as we default to byslot mapping, and I believe srun
> defaults to bynode?

FWIW all our environment modules that do OMPI have:

setenv OMPI_MCA_orte_process_binding core

> Might be worth another comparison run when someone has time.

Yeah, I'll try and queue up some more tests - unfortunately the
cluster we tested on then is flat out at the moment but I'll try and
sneak a 64-core job using identical configs and compare mpirun, srun
on its own and srun with PMI2.
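
For anyone following along, here is a rough sketch of why the two
mapping policies Ralph mentions could give different layouts. It is
only a toy model in Python, not how Open MPI or Slurm actually place
ranks: the function names and the small node counts are made up for
illustration, and whether srun really defaults to a bynode-style
layout is Ralph's guess above, not something confirmed here.

```python
def map_byslot(num_ranks, num_nodes, slots_per_node):
    """Toy byslot mapping: fill every slot on a node before moving on.

    Returns a list where entry i is the node index hosting rank i.
    """
    return [(rank // slots_per_node) % num_nodes for rank in range(num_ranks)]


def map_bynode(num_ranks, num_nodes):
    """Toy bynode mapping: deal ranks out round-robin, one per node per pass."""
    return [rank % num_nodes for rank in range(num_ranks)]


def neighbours_on_same_node(layout):
    """Count adjacent rank pairs (i, i+1) that land on the same node."""
    return sum(1 for a, b in zip(layout, layout[1:]) if a == b)


if __name__ == "__main__":
    # Hypothetical small job: 4 nodes with 4 slots each (not the 70 x 16 case).
    nodes, ppn = 4, 4
    ranks = nodes * ppn

    byslot = map_byslot(ranks, nodes, ppn)
    bynode = map_bynode(ranks, nodes)

    print("byslot:", byslot)
    print("bynode:", bynode)
    print("adjacent ranks sharing a node (byslot):", neighbours_on_same_node(byslot))
    print("adjacent ranks sharing a node (bynode):", neighbours_on_same_node(bynode))
```

With byslot, consecutive ranks mostly share a node, so nearest-neighbour
traffic tends to stay on-node; with bynode, consecutive ranks always sit
on different nodes. That alone could plausibly shift timings for a
communication-heavy code even when the launcher is irrelevant.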

All the best,
Chris
- -- 
 Christopher Samuel        Senior Systems Administrator
 VLSCI - Victorian Life Sciences Computation Initiative
 Email: sam...@unimelb.edu.au Phone: +61 (0)3 903 55545
 http://www.vlsci.org.au/      http://twitter.com/vlsci

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.14 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/

iEYEARECAAYFAlNq/K8ACgkQO2KABBYQAh/q0wCcDvYjl4tYVXrHNciCkKgbnwF7
VHoAn3Q+gZXQNKzs++3uajmiGTkq/EeD
=ucJg
-----END PGP SIGNATURE-----