Ralph, my test VM is single socket, four cores. Here is something odd I just found when running mpirun -np 2 intercomm_create:
- tasks [0,1] are bound on cpus [0,1] => OK
- tasks [2,3] (first spawn) are bound on cpus [2,3] => OK
- tasks [4,5] (second spawn) are not bound (and cpuset is [0-3]) => OK
In ompi_proc_set_locality (ompi/proc/proc.c:228) on task 0:

    locality = opal_hwloc_base_get_relative_locality(opal_hwloc_topology, ompi_process_info.cpuset, cpu_bitmap);

where ompi_process_info.cpuset is "0", cpu_bitmap is "0-3", and locality is set to OPAL_PROC_ON_HWTHREAD (!)

Is this correct? I would have expected OPAL_PROC_ON_L2CACHE (since there is a single L2 cache on my VM, as reported by lstopo) or even OPAL_PROC_LOCALITY_UNKNOWN.

Then in mca_coll_ml_comm_query (ompi/mca/coll/ml/coll_ml_module.c:2899) the module disqualifies itself if !ompi_rte_proc_bound. If locality were previously set to OPAL_PROC_LOCALITY_UNKNOWN, coll/ml could check the flag of all the procs of the communicator and disqualify itself if at least one of them is OPAL_PROC_LOCALITY_UNKNOWN.

As you wrote, there might be a bunch of other corner cases. That being said, I'll try to write a simple proof of concept and see if this specific hang can be avoided.

Cheers,

Gilles

On 2014/06/20 12:08, Ralph Castain wrote:
> It is related, but it means that coll/ml has a higher degree of sensitivity
> to the binding pattern than what you reported (which was that coll/ml doesn't
> work with unbound processes). What we are now seeing is that coll/ml also
> doesn't work when processes are bound across sockets.
>
> Which means that Nathan's revised tests are going to have to cover a lot more
> corner cases. Our locality flags don't currently include
> "bound-to-multiple-sockets", and I'm not sure how he is going to easily
> resolve that case.
>