Hi, I am facing a problem with a test that runs fine on some nodes, and fails on others.
I have a heterogeneous cluster with 3 types of nodes:
1) single socket, 4 cores
2) 2 sockets, 4 cores per socket
3) 2 sockets, 6 cores per socket

I am using:
. salloc to allocate the nodes,
. the mpirun binding/mapping options "-bind-to-socket -bysocket"

# salloc -N 1 mpirun -n 4 -bind-to-socket -bysocket sleep 900

This command fails if the allocated node is of type #1 (single socket, 4 cpus), while it succeeds when run on nodes of type #2 or #3.
BTW, in the failing case orte_show_help references a tag ("could-not-bind-to-socket") that does not exist in help-odls-default.txt.

I think a "bind to socket" should not return an error on a single-socket machine, but rather be a noop.

The problem comes from the test
    OPAL_PAFFINITY_PROCESS_IS_BOUND(mask, &bound);
called in odls_default_fork_local_proc() after the binding to the processor's socket has been done:

========
<snip>
    OPAL_PAFFINITY_CPU_ZERO(mask);
    for (n=0; n < orte_default_num_cores_per_socket; n++) {
        <snip>
        OPAL_PAFFINITY_CPU_SET(phys_cpu, mask);
    }
    /* if we did not bind it anywhere, then that is an error */
    OPAL_PAFFINITY_PROCESS_IS_BOUND(mask, &bound);
    if (!bound) {
        orte_show_help("help-odls-default.txt",
                       "odls-default:could-not-bind-to-socket", true);
        ORTE_ODLS_ERROR_OUT(ORTE_ERR_FATAL);
    }
========

OPAL_PAFFINITY_PROCESS_IS_BOUND() returns true if there are bits set in the mask *AND* the number of bits set is less than the number of cpus on the machine. Thus on a single-socket, 4-core machine the socket mask covers all 4 cpus and the test fails, while on the other kinds of machines it succeeds.

Again, I think the problem could be solved by changing the algorithm so that ORTE_BIND_TO_SOCKET on a single-socket machine is a noop.
Another solution could be to call OPAL_PAFFINITY_PROCESS_IS_BOUND() at the end of the loop only if we are bound (orte_odls_globals.bound). Actually, that is the only case where I see a justification for this test (see attached patch).
Maybe both solutions could be combined.
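To make the failing case concrete, here is a minimal standalone sketch of the test as described above. It is not the actual Open MPI macro: is_bound_model() and socket0_mask() are hypothetical helpers, and the contiguous cpu numbering within a socket is an assumption made only for illustration.

```c
#include <stdbool.h>

/* Hypothetical model of the test described above (NOT the real macro):
 * "bound" means at least one bit is set in the mask AND fewer bits are
 * set than there are cpus on the machine. */
static bool is_bound_model(unsigned long mask, int num_cpus)
{
    int num_set = 0;
    for (int i = 0; i < num_cpus; i++) {
        if (mask & (1UL << i)) {
            num_set++;
        }
    }
    return num_set > 0 && num_set < num_cpus;
}

/* Hypothetical helper: mask with one bit per core of socket 0,
 * assuming the cores of a socket are numbered contiguously from 0. */
static unsigned long socket0_mask(int cores_per_socket)
{
    return (1UL << cores_per_socket) - 1UL;
}
```

With this model, is_bound_model(socket0_mask(4), 4) is false for node type #1 (all 4 cpus are in the mask), while is_bound_model(socket0_mask(4), 8) and is_bound_model(socket0_mask(6), 12) are true for types #2 and #3.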
Regards, Nadia -- Nadia Derbey <nadia.der...@bull.net>
Do not test actual process binding in obvious cases

diff -r 0b851b2e7934 orte/mca/odls/default/odls_default_module.c
--- a/orte/mca/odls/default/odls_default_module.c	Thu Mar 18 16:10:25 2010 +0100
+++ b/orte/mca/odls/default/odls_default_module.c	Fri Apr 09 11:38:28 2010 +0200
@@ -747,12 +747,16 @@ static int odls_default_fork_local_proc(
                                      target_socket, phys_core, phys_cpu));
                 OPAL_PAFFINITY_CPU_SET(phys_cpu, mask);
             }
-            /* if we did not bind it anywhere, then that is an error */
-            OPAL_PAFFINITY_PROCESS_IS_BOUND(mask, &bound);
-            if (!bound) {
-                orte_show_help("help-odls-default.txt",
-                               "odls-default:could-not-bind-to-socket", true);
-                ORTE_ODLS_ERROR_OUT(ORTE_ERR_FATAL);
+            /* if we actually did not bind it anywhere and it was
+             * originally bound then that is an error
+             */
+            if (orte_odls_globals.bound) {
+                OPAL_PAFFINITY_PROCESS_IS_BOUND(mask, &bound);
+                if (!bound) {
+                    orte_show_help("help-odls-default.txt",
+                                   "odls-default:could-not-bind-to-socket", true);
+                    ORTE_ODLS_ERROR_OUT(ORTE_ERR_FATAL);
+                }
             }
             if (orte_report_bindings) {
                 opal_output(0, "%s odls:default:fork binding child %s to socket %d cpus %04lx",
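
As a rough sketch of what the patch changes, the two decision paths can be modelled as plain functions. All names here are hypothetical: originally_bound stands in for orte_odls_globals.bound, and mask_is_partial() only models the OPAL_PAFFINITY_PROCESS_IS_BOUND() condition described in the message; none of this is the real ORTE code.

```c
#include <stdbool.h>

/* Hypothetical model of the IS_BOUND condition: some bits are set in
 * the mask, but fewer than the number of cpus on the machine. */
static bool mask_is_partial(int bits_set, int num_cpus)
{
    return bits_set > 0 && bits_set < num_cpus;
}

/* Before the patch: the test always runs; returning true here
 * corresponds to taking the fatal error path. */
static bool fails_before_patch(int bits_set, int num_cpus)
{
    return !mask_is_partial(bits_set, num_cpus);
}

/* After the patch: the test runs only when the process was already
 * bound (originally_bound models orte_odls_globals.bound). */
static bool fails_after_patch(bool originally_bound, int bits_set, int num_cpus)
{
    if (!originally_bound) {
        return false;   /* test skipped, no error raised */
    }
    return !mask_is_partial(bits_set, num_cpus);
}
```

Under this model, the single-socket, 4-core case (4 bits set, 4 cpus) fails before the patch and no longer fails afterwards when the process was not originally bound. It would still fail if the process was originally bound, which is where the "noop on a single socket" approach could complement the patch.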