On Fri, 2010-04-09 at 14:23 -0400, Terry Dontje wrote:
> Ralph Castain wrote:
> > Okay, just wanted to ensure everyone was working from the same base
> > code.
> >
> > Terry, Brad: you might want to look this proposed change over.
> > Something doesn't quite look right to me, but I haven't really
> > walked through the code to check it.
> >
> At first blush I don't really get the usage of orte_odls_globals.bound
> in your patch. It would seem to me that the insertion of that
> conditional would prevent the check it surrounds from being done when the
> process has not been bound prior to startup, which is a common case.

Well, if you have a look at the algo in the ORTE_BIND_TO_SOCKET path
(odls_default_fork_local_proc() in odls_default_module.c):

<set target_socket depending on the desired mapping>
<set my paffinity mask to 0>       (line 715)
<for each core in the socket> {
    <get the associated phys_core>
    <get the associated phys_cpu>
    <if we are bound (orte_odls_globals.bound)> {
        <if phys_cpu does not belong to the cpus I'm bound to>
            continue
    }
    <set phys_cpu bit in my affinity mask>
}
<check if something is set in my affinity mask>
...

What I'm saying is that the only way to have nothing set in the affinity
mask (which would justify the last test) is to have never called the
<set phys_cpu bit in my affinity mask> instruction. This means:
  . the test on orte_odls_globals.bound is true
  . call <continue> for all the cores in the socket.

In the other path, what we are doing is checking if we have set one or
more bits in a mask after having actually set them: don't you think
it's useless?

That's why I'm suggesting to call the last check only if
orte_odls_globals.bound is true.

Regards,
Nadia

>
> --td
>
> > On Apr 9, 2010, at 9:33 AM, Terry Dontje wrote:
> >
> > > Nadia Derbey wrote:
> > > > On Fri, 2010-04-09 at 08:41 -0600, Ralph Castain wrote:
> > > > > Just to check: is this with the latest trunk? Brad and Terry have
> > > > > been making changes to this section of code, including modifying the
> > > > > PROCESS_IS_BOUND test...
> > > > >
> > > > Well, it was on the v1.5. But I just checked: looks like
> > > > 1. the call to OPAL_PAFFINITY_PROCESS_IS_BOUND is still there in
> > > >    odls_default_fork_local_proc()
> > > > 2. OPAL_PAFFINITY_PROCESS_IS_BOUND() is defined the same way
> > > >
> > > > But, I'll give it a try with the latest trunk.
> > > >
> > > > Regards,
> > > > Nadia
> > > >
> > > The changes I've done do not touch
> > > OPAL_PAFFINITY_PROCESS_IS_BOUND at all.
> > > Also, I am only touching
> > > code related to the "bind-to-core" option, so I really doubt that my
> > > changes are causing issues here.
> > >
> > > --td
> > > > > On Apr 9, 2010, at 3:39 AM, Nadia Derbey wrote:
> > > > >
> > > > > > Hi,
> > > > > >
> > > > > > I am facing a problem with a test that runs fine on some nodes, and
> > > > > > fails on others.
> > > > > >
> > > > > > I have a heterogeneous cluster, with 3 types of nodes:
> > > > > > 1) Single socket, 4 cores
> > > > > > 2) 2 sockets, 4 cores per socket
> > > > > > 3) 2 sockets, 6 cores per socket
> > > > > >
> > > > > > I am using:
> > > > > > . salloc to allocate the nodes,
> > > > > > . mpirun binding/mapping options "-bind-to-socket -bysocket"
> > > > > >
> > > > > > # salloc -N 1 mpirun -n 4 -bind-to-socket -bysocket sleep 900
> > > > > >
> > > > > > This command fails if the allocated node is of type #1 (single
> > > > > > socket/4 cpus).
> > > > > > BTW, in that case orte_show_help is referencing a tag
> > > > > > ("could-not-bind-to-socket") that does not exist in
> > > > > > help-odls-default.txt.
> > > > > >
> > > > > > While it succeeds when run on nodes of type #2 or 3.
> > > > > > I think a "bind to socket" should not return an error on a single
> > > > > > socket machine, but rather be a noop.
> > > > > >
> > > > > > The problem comes from the test
> > > > > > OPAL_PAFFINITY_PROCESS_IS_BOUND(mask, &bound);
> > > > > > called in odls_default_fork_local_proc() after the binding to the
> > > > > > processor's socket has been done:
> > > > > > ========
> > > > > > <snip>
> > > > > > OPAL_PAFFINITY_CPU_ZERO(mask);
> > > > > > for (n=0; n < orte_default_num_cores_per_socket; n++) {
> > > > > >     <snip>
> > > > > >     OPAL_PAFFINITY_CPU_SET(phys_cpu, mask);
> > > > > > }
> > > > > > /* if we did not bind it anywhere, then that is an error */
> > > > > > OPAL_PAFFINITY_PROCESS_IS_BOUND(mask, &bound);
> > > > > > if (!bound) {
> > > > > >     orte_show_help("help-odls-default.txt",
> > > > > >                    "odls-default:could-not-bind-to-socket",
> > > > > >                    true);
> > > > > >     ORTE_ODLS_ERROR_OUT(ORTE_ERR_FATAL);
> > > > > > }
> > > > > > ========
> > > > > > OPAL_PAFFINITY_PROCESS_IS_BOUND() will return true if there are bits
> > > > > > set in the mask *AND* the number of bits set is less than the number
> > > > > > of cpus on the machine. Thus on a single-socket, 4-core machine the
> > > > > > test will fail, while on the other kinds of machines it will succeed.
> > > > > >
> > > > > > Again, I think the problem could be solved by changing the
> > > > > > algorithm, and assuming that ORTE_BIND_TO_SOCKET on a single-socket
> > > > > > machine = noop.
> > > > > >
> > > > > > Another solution could be to call the test
> > > > > > OPAL_PAFFINITY_PROCESS_IS_BOUND() at the end of the loop only if we
> > > > > > are bound (orte_odls_globals.bound). Actually that is the only case
> > > > > > where I see a justification for this test (see attached patch).
> > > > > >
> > > > > > And maybe both solutions could be mixed.
> > > > > >
> > > > > > Regards,
> > > > > > Nadia
> > > > > >
> > > > > > --
> > > > > > Nadia Derbey <nadia.der...@bull.net>
> > > > > > <001_fix_process_binding_test.patch>
> > > > > > _______________________________________________
> > > > > > devel mailing list
> > > > > > de...@open-mpi.org
> > > > > > http://www.open-mpi.org/mailman/listinfo.cgi/devel
> > >
> > > --
> > > Terry D. Dontje | Principal Software Engineer
> > > Developer Tools Engineering | +1.650.633.7054
> > > Oracle - Performance Technologies
> > > 95 Network Drive, Burlington, MA 01803
> > > Email terry.don...@oracle.com

--
Nadia Derbey <nadia.der...@bull.net>
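
For illustration, the failure discussed in this thread can be reproduced with a small standalone model. The following is a minimal sketch in C, not the Open MPI code: cpu_mask_t, count_bits(), process_is_bound(), socket_mask() and the two bind_to_socket_ok_*() helpers are hypothetical names invented here. It assumes cpus are numbered contiguously per socket and that the machine has at most 64 cpus. It only encodes the behaviour described in the thread for OPAL_PAFFINITY_PROCESS_IS_BOUND() (some bits set AND fewer bits set than cpus on the machine) and the suggestion to run the check only when orte_odls_globals.bound is true.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the paffinity mask: one bit per physical cpu. */
typedef uint64_t cpu_mask_t;

/* Count the bits set in the mask. */
static int count_bits(cpu_mask_t mask)
{
    int n = 0;
    while (mask != 0) {
        mask &= mask - 1;   /* clear the lowest set bit */
        n++;
    }
    return n;
}

/* Models the semantics described in the thread: "bound" means at least one
 * bit is set AND fewer bits are set than there are cpus on the machine. */
static bool process_is_bound(cpu_mask_t mask, int num_cpus)
{
    int set = count_bits(mask);
    return set > 0 && set < num_cpus;
}

/* Mask covering every core of one socket (assumption of this sketch:
 * contiguous cpu numbering, socket s owns cpus s*cores .. s*cores+cores-1). */
static cpu_mask_t socket_mask(int socket, int cores_per_socket)
{
    cpu_mask_t mask = 0;
    assert((socket + 1) * cores_per_socket <= 64);
    for (int n = 0; n < cores_per_socket; n++) {
        mask |= (cpu_mask_t)1 << (socket * cores_per_socket + n);
    }
    return mask;
}

/* Behaviour reported in the thread: the check always runs after the loop. */
static bool bind_to_socket_ok_current(int socket, int sockets,
                                      int cores_per_socket)
{
    cpu_mask_t mask = socket_mask(socket, cores_per_socket);
    return process_is_bound(mask, sockets * cores_per_socket);
}

/* Suggested behaviour: run the check only if the process was already bound
 * before launch (models orte_odls_globals.bound). 'allowed' is the set of
 * cpus of that pre-existing binding; it is ignored when not bound. */
static bool bind_to_socket_ok_proposed(int socket, int sockets,
                                       int cores_per_socket,
                                       bool already_bound, cpu_mask_t allowed)
{
    cpu_mask_t mask = socket_mask(socket, cores_per_socket);
    if (!already_bound) {
        return true;        /* bits were just set, nothing to verify */
    }
    mask &= allowed;        /* cpus outside the binding are skipped */
    return process_is_bound(mask, sockets * cores_per_socket);
}
```

In this model, bind_to_socket_ok_current(0, 1, 4) is false for the single-socket, 4-core node, because all 4 of 4 cpus end up in the mask, while the two-socket node types pass. With the suggested variant, the unbound single-socket case passes, and the check still fails when a pre-existing binding excludes every cpu of the target socket.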