On Apr 12, 2012, at 11:25, Reuti wrote:

>> This may be of limited value since we're using IntelMPI, not MVAPICH, but we
>> did face similar problems. By default, iMPI would bind to cores 0..n-1, so
>> multiple jobs on one host would step on each other's toes.
>
> This will also happen then outside of SGE in case you use mpiexec just on the
> command line - right?
Yes.

> Is each rank bound to a specific core, or just the set of ranks to a set
> of cores, and the OS can place them inside this set as it likes?

I think there is a one-to-one binding, but I am not sure (see the sketch
below for a way to check).

>> It is possible to disable the binding (I_MPI_PIN=disable), but that would
>> also degrade performance badly (we have Magny-Cours CPUs).
>
> Interesting, this would mean that the scheduler in the OS is not
> operating/predicting in the best way.

Yes. I guess the processes start out on occupied (NUMA) nodes (or two jobs
start at the same time, on the same nodes). They then allocate memory and
start producing load. The OS scheduler will try to move them to otherwise
idle nodes, but that slows down memory access. From what I've seen, such
processes tend to switch cores frequently, so the scheduler probably tries
to balance fast memory access against available CPU time according to the
(shifting) memory access patterns of the processes.


A.
--
Ansgar Esztermann
DV-Systemadministration
Max-Planck-Institut für biophysikalische Chemie, Abteilung 105
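P.S. A quick empirical way to settle the one-to-one vs. set question (an
untested sketch, assuming Linux, where sched_getaffinity(2) is available,
and an MPI compiler wrapper such as mpicc): have every rank print its own
affinity mask. If each rank reports exactly one core, the binding is
one-to-one; if all ranks report the same set, the OS may place them
anywhere within it.

/* Print each rank's CPU affinity mask. Output lines from different
 * ranks may interleave. Linux-specific. */
#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv)
{
    int rank;
    cpu_set_t mask;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* pid 0 = the calling process itself */
    if (sched_getaffinity(0, sizeof(mask), &mask) == 0) {
        printf("rank %d may run on core(s):", rank);
        for (int cpu = 0; cpu < CPU_SETSIZE; cpu++)
            if (CPU_ISSET(cpu, &mask))
                printf(" %d", cpu);
        printf("\n");  /* exactly one core listed => one-to-one binding */
    }

    MPI_Finalize();
    return 0;
}

Running it under mpiexec with and without I_MPI_PIN=disable should show
how the masks differ between the two cases.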
