This may be of limited value since we're using Intel MPI, not MVAPICH, but we did face similar problems. By default, Intel MPI binds its processes to cores 0..n-1, so multiple jobs on one host would step on each other's toes. It is possible to disable the binding (I_MPI_PIN=disable), but that also degrades performance badly (we have Magny-Cours CPUs). Nowadays, we include this in our job scripts:
#$ -binding env linear:8
PROC_LIST=${SGE_BINDING// /,}
export I_MPI_PIN_PROCESSOR_LIST=${PROC_LIST%,}
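
To show what the two expansions above do, here is a minimal sketch that can be run outside of a job. The SGE_BINDING value is made up for illustration; in a real job gridengine sets it when "-binding env" is requested. I am assuming the value is a space-separated core list that may end in a trailing space, which is what the final "%," is there to clean up. Note that ${VAR// /,} is a bash feature, so the job has to run under bash.

```shell
#!/bin/bash
# Hypothetical value for illustration only; gridengine sets SGE_BINDING
# inside a real job. The trailing space is an assumption.
SGE_BINDING="0 1 2 3 4 5 6 7 "

# Replace every space with a comma (bash pattern substitution).
# The trailing space becomes a trailing comma.
PROC_LIST=${SGE_BINDING// /,}

# Remove one trailing comma, if there is one.
export I_MPI_PIN_PROCESSOR_LIST=${PROC_LIST%,}

echo "$I_MPI_PIN_PROCESSOR_LIST"
```

If the value has no trailing space, the "%," expansion simply leaves the list unchanged.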
A.
On Apr 11, 2012, at 16:11 , Dave Love wrote:
> The gridengine binding (which gridengine keeps track of) separates jobs,
> and it should be noticed by the MPI, which should then bind the
> individual processes to the cores it's been given. I don't know
> mvapich, but I know it uses hwloc, and should be able to do this
> properly like openmpi does (modulo issues with recent hardware, sigh).
> I thought mvapich would do the right thing automatically -- openmpi is
> said often to look bad performance-wise by not doing core binding by
> default.
--
Ansgar Esztermann
DV-Systemadministration
Max-Planck-Institut für biophysikalische Chemie, Abteilung 105
