Interesting - yes, coll sm doesn't think the two procs are on the same node for some reason. Try adding -mca grpcomm_base_verbose 5 to your command line and let's see why.
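For example, something along these lines. This is only a sketch: the mpiexec path and ./bmem are copied from your message quoted below, I trimmed your option list to the coll-related ones (keep any of the others you still want), and the existence check is my own addition so the snippet is harmless on a machine without that install.

```shell
# Path and test program copied from the quoted run; adjust for your install.
MPIEXEC=/home/jarico/shared/packages/openmpi-cas-dbg/bin/mpiexec

# The coll options from the earlier run, plus the grpcomm verbosity.
OPTS="--mca coll_sm_priority 99 --mca coll_base_verbose 100 -mca grpcomm_base_verbose 5 --display-map"

if [ -x "$MPIEXEC" ]; then
    # OPTS is left unquoted on purpose so it splits into separate arguments.
    "$MPIEXEC" $OPTS -n 2 ./bmem
else
    echo "mpiexec not found at $MPIEXEC; would run with: $OPTS -n 2 ./bmem"
fi
```

Then send along the output from that run.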
On Jul 3, 2012, at 1:24 PM, Juan Antonio Rico Gallego wrote:

> The code I run is a simple broadcast.
>
> When I do not specify components to run, the output is (more verbose):
>
> [jarico@Metropolis-01 examples]$ /home/jarico/shared/packages/openmpi-cas-dbg/bin/mpiexec --mca mca_base_verbose 100 --mca mca_coll_base_output 100 --mca coll_sm_priority 99 -mca hwloc_base_verbose 90 --display-map --mca mca_verbose 100 --mca mca_base_verbose 100 --mca coll_base_verbose 100 -n 2 ./bmem
> [Metropolis-01:24490] mca: base: components_open: Looking for hwloc components
> [Metropolis-01:24490] mca: base: components_open: opening hwloc components
> [Metropolis-01:24490] mca: base: components_open: found loaded component hwloc142
> [Metropolis-01:24490] mca: base: components_open: component hwloc142 has no register function
> [Metropolis-01:24490] mca: base: components_open: component hwloc142 has no open function
> [Metropolis-01:24490] hwloc:base:get_topology
> [Metropolis-01:24490] hwloc:base: no cpus specified - using root available cpuset
>
> ======================== JOB MAP ========================
>
> Data for node: Metropolis-01  Num procs: 2
>     Process OMPI jobid: [36336,1] App: 0 Process rank: 0
>     Process OMPI jobid: [36336,1] App: 0 Process rank: 1
>
> =============================================================
> [Metropolis-01:24491] mca: base: components_open: Looking for hwloc components
> [Metropolis-01:24491] mca: base: components_open: opening hwloc components
> [Metropolis-01:24491] mca: base: components_open: found loaded component hwloc142
> [Metropolis-01:24491] mca: base: components_open: component hwloc142 has no register function
> [Metropolis-01:24491] mca: base: components_open: component hwloc142 has no open function
> [Metropolis-01:24492] mca: base: components_open: Looking for hwloc components
> [Metropolis-01:24492] mca: base: components_open: opening hwloc components
> [Metropolis-01:24492] mca: base: components_open: found loaded component hwloc142
> [Metropolis-01:24492] mca: base: components_open: component hwloc142 has no register function
> [Metropolis-01:24492] mca: base: components_open: component hwloc142 has no open function
> [Metropolis-01:24491] locality: CL:CU:N:B
> [Metropolis-01:24491] hwloc:base: get available cpus
> [Metropolis-01:24491] hwloc:base:get_available_cpus first time - filtering cpus
> [Metropolis-01:24491] hwloc:base: no cpus specified - using root available cpuset
> [Metropolis-01:24491] hwloc:base:get_available_cpus root object
> [Metropolis-01:24491] mca: base: components_open: Looking for coll components
> [Metropolis-01:24491] mca: base: components_open: opening coll components
> [Metropolis-01:24491] mca: base: components_open: found loaded component tuned
> [Metropolis-01:24491] mca: base: components_open: component tuned has no register function
> [Metropolis-01:24491] coll:tuned:component_open: done!
> [Metropolis-01:24491] mca: base: components_open: component tuned open function successful
> [Metropolis-01:24491] mca: base: components_open: found loaded component sm
> [Metropolis-01:24491] mca: base: components_open: component sm register function successful
> [Metropolis-01:24491] mca: base: components_open: component sm has no open function
> [Metropolis-01:24491] mca: base: components_open: found loaded component libnbc
> [Metropolis-01:24491] mca: base: components_open: component libnbc register function successful
> [Metropolis-01:24491] mca: base: components_open: component libnbc open function successful
> [Metropolis-01:24491] mca: base: components_open: found loaded component hierarch
> [Metropolis-01:24491] mca: base: components_open: component hierarch has no register function
> [Metropolis-01:24491] mca: base: components_open: component hierarch open function successful
> [Metropolis-01:24491] mca: base: components_open: found loaded component basic
> [Metropolis-01:24491] mca: base: components_open: component basic register function successful
> [Metropolis-01:24491] mca: base: components_open: component basic has no open function
> [Metropolis-01:24491] mca: base: components_open: found loaded component inter
> [Metropolis-01:24491] mca: base: components_open: component inter has no register function
> [Metropolis-01:24491] mca: base: components_open: component inter open function successful
> [Metropolis-01:24491] mca: base: components_open: found loaded component self
> [Metropolis-01:24491] mca: base: components_open: component self has no register function
> [Metropolis-01:24491] mca: base: components_open: component self open function successful
> [Metropolis-01:24492] locality: CL:CU:N:B
> [Metropolis-01:24492] hwloc:base: get available cpus
> [Metropolis-01:24492] hwloc:base:get_available_cpus first time - filtering cpus
> [Metropolis-01:24492] hwloc:base: no cpus specified - using root available cpuset
> [Metropolis-01:24492] hwloc:base:get_available_cpus root object
> [Metropolis-01:24492] mca: base: components_open: Looking for coll components
> [Metropolis-01:24492] mca: base: components_open: opening coll components
> [Metropolis-01:24492] mca: base: components_open: found loaded component tuned
> [Metropolis-01:24492] mca: base: components_open: component tuned has no register function
> [Metropolis-01:24492] coll:tuned:component_open: done!
> [Metropolis-01:24492] mca: base: components_open: component tuned open function successful
> [Metropolis-01:24492] mca: base: components_open: found loaded component sm
> [Metropolis-01:24492] mca: base: components_open: component sm register function successful
> [Metropolis-01:24492] mca: base: components_open: component sm has no open function
> [Metropolis-01:24492] mca: base: components_open: found loaded component libnbc
> [Metropolis-01:24492] mca: base: components_open: component libnbc register function successful
> [Metropolis-01:24492] mca: base: components_open: component libnbc open function successful
> [Metropolis-01:24492] mca: base: components_open: found loaded component hierarch
> [Metropolis-01:24492] mca: base: components_open: component hierarch has no register function
> [Metropolis-01:24492] mca: base: components_open: component hierarch open function successful
> [Metropolis-01:24492] mca: base: components_open: found loaded component basic
> [Metropolis-01:24492] mca: base: components_open: component basic register function successful
> [Metropolis-01:24492] mca: base: components_open: component basic has no open function
> [Metropolis-01:24492] mca: base: components_open: found loaded component inter
> [Metropolis-01:24492] mca: base: components_open: component inter has no register function
> [Metropolis-01:24492] mca: base: components_open: component inter open function successful
> [Metropolis-01:24492] mca: base: components_open: found loaded component self
> [Metropolis-01:24492] mca: base: components_open: component self has no register function
> [Metropolis-01:24492] mca: base: components_open: component self open function successful
> [Metropolis-01:24491] coll:find_available: querying coll component tuned
> [Metropolis-01:24491] coll:find_available: coll component tuned is available
> [Metropolis-01:24491] coll:find_available: querying coll component sm
> [Metropolis-01:24491] coll:sm:init_query: no other local procs; disqualifying myself
> [Metropolis-01:24491] coll:find_available: coll component sm is not available
> [Metropolis-01:24491] coll:find_available: querying coll component libnbc
> [Metropolis-01:24491] coll:find_available: coll component libnbc is available
> [Metropolis-01:24491] coll:find_available: querying coll component hierarch
> [Metropolis-01:24491] coll:find_available: coll component hierarch is available
> [Metropolis-01:24491] coll:find_available: querying coll component basic
> [Metropolis-01:24491] coll:find_available: coll component basic is available
> [Metropolis-01:24491] coll:find_available: querying coll component inter
> [Metropolis-01:24492] coll:find_available: querying coll component tuned
> [Metropolis-01:24492] coll:find_available: coll component tuned is available
> [Metropolis-01:24492] coll:find_available: querying coll component sm
> [Metropolis-01:24492] coll:sm:init_query: no other local procs; disqualifying myself
> [Metropolis-01:24492] coll:find_available: coll component sm is not available
> [Metropolis-01:24492] coll:find_available: querying coll component libnbc
> [Metropolis-01:24492] coll:find_available: coll component libnbc is available
> [Metropolis-01:24492] coll:find_available: querying coll component hierarch
> [Metropolis-01:24492] coll:find_available: coll component hierarch is available
> [Metropolis-01:24492] coll:find_available: querying coll component basic
> [Metropolis-01:24492] coll:find_available: coll component basic is available
> [Metropolis-01:24492] coll:find_available: querying coll component inter
> [Metropolis-01:24492] coll:find_available: coll component inter is available
> [Metropolis-01:24492] coll:find_available: querying coll component self
> [Metropolis-01:24492] coll:find_available: coll component self is available
> [Metropolis-01:24491] coll:find_available: coll component inter is available
> [Metropolis-01:24491] coll:find_available: querying coll component self
> [Metropolis-01:24491] coll:find_available: coll component self is available
> [Metropolis-01:24492] hwloc:base:get_nbojbs computed data 0 of NUMANode:0
> [Metropolis-01:24491] hwloc:base:get_nbojbs computed data 0 of NUMANode:0
> [Metropolis-01:24491] coll:base:comm_select: new communicator: MPI_COMM_WORLD (cid 0)
> [Metropolis-01:24491] coll:base:comm_select: Checking all available modules
> [Metropolis-01:24491] coll:tuned:module_tuned query called
> [Metropolis-01:24491] coll:base:comm_select: component available: tuned, priority: 30
> [Metropolis-01:24491] coll:base:comm_select: component available: libnbc, priority: 10
> [Metropolis-01:24491] coll:base:comm_select: component not available: hierarch
> [Metropolis-01:24491] coll:base:comm_select: component available: basic, priority: 10
> [Metropolis-01:24491] coll:base:comm_select: component not available: inter
> [Metropolis-01:24491] coll:base:comm_select: component not available: self
> [Metropolis-01:24491] coll:tuned:module_init called.
> [Metropolis-01:24491] coll:tuned:module_init Tuned is in use
> [Metropolis-01:24491] coll:base:comm_select: new communicator: MPI_COMM_SELF (cid 1)
> [Metropolis-01:24491] coll:base:comm_select: Checking all available modules
> [Metropolis-01:24491] coll:tuned:module_tuned query called
> [Metropolis-01:24491] coll:base:comm_select: component not available: tuned
> [Metropolis-01:24491] coll:base:comm_select: component available: libnbc, priority: 10
> [Metropolis-01:24491] coll:base:comm_select: component not available: hierarch
> [Metropolis-01:24491] coll:base:comm_select: component available: basic, priority: 10
> [Metropolis-01:24491] coll:base:comm_select: component not available: inter
> [Metropolis-01:24491] coll:base:comm_select: component available: self, priority: 75
> [Metropolis-01:24492] coll:base:comm_select: new communicator: MPI_COMM_WORLD (cid 0)
> [Metropolis-01:24492] coll:base:comm_select: Checking all available modules
> [Metropolis-01:24492] coll:tuned:module_tuned query called
> [Metropolis-01:24492] coll:base:comm_select: component available: tuned, priority: 30
> [Metropolis-01:24492] coll:base:comm_select: component available: libnbc, priority: 10
> [Metropolis-01:24492] coll:base:comm_select: component not available: hierarch
> [Metropolis-01:24492] coll:base:comm_select: component available: basic, priority: 10
> [Metropolis-01:24492] coll:base:comm_select: component not available: inter
> [Metropolis-01:24492] coll:base:comm_select: component not available: self
> [Metropolis-01:24492] coll:tuned:module_init called.
> [Metropolis-01:24492] coll:tuned:module_init Tuned is in use
> [Metropolis-01:24492] coll:base:comm_select: new communicator: MPI_COMM_SELF (cid 1)
> [Metropolis-01:24492] coll:base:comm_select: Checking all available modules
> [Metropolis-01:24492] coll:tuned:module_tuned query called
> [Metropolis-01:24492] coll:base:comm_select: component not available: tuned
> [Metropolis-01:24492] coll:base:comm_select: component available: libnbc, priority: 10
> [Metropolis-01:24492] coll:base:comm_select: component not available: hierarch
> [Metropolis-01:24492] coll:base:comm_select: component available: basic, priority: 10
> [Metropolis-01:24492] coll:base:comm_select: component not available: inter
> [Metropolis-01:24492] coll:base:comm_select: component available: self, priority: 75
> [Metropolis-01:24491] coll:tuned:component_close: called
> [Metropolis-01:24491] coll:tuned:component_close: done!
> [Metropolis-01:24492] coll:tuned:component_close: called
> [Metropolis-01:24492] coll:tuned:component_close: done!
> [Metropolis-01:24492] mca: base: close: component tuned closed
> [Metropolis-01:24492] mca: base: close: unloading component tuned
> [Metropolis-01:24492] mca: base: close: component libnbc closed
> [Metropolis-01:24492] mca: base: close: unloading component libnbc
> [Metropolis-01:24492] mca: base: close: unloading component hierarch
> [Metropolis-01:24492] mca: base: close: unloading component basic
> [Metropolis-01:24492] mca: base: close: unloading component inter
> [Metropolis-01:24492] mca: base: close: unloading component self
> [Metropolis-01:24491] mca: base: close: component tuned closed
> [Metropolis-01:24491] mca: base: close: unloading component tuned
> [Metropolis-01:24491] mca: base: close: component libnbc closed
> [Metropolis-01:24491] mca: base: close: unloading component libnbc
> [Metropolis-01:24491] mca: base: close: unloading component hierarch
> [Metropolis-01:24491] mca: base: close: unloading component basic
> [Metropolis-01:24491] mca: base: close: unloading component inter
> [Metropolis-01:24491] mca: base: close: unloading component self
> [jarico@Metropolis-01 examples]$
>
>
> SM is not loaded because it detects no other processes on the same
> machine:
>
> [Metropolis-01:24491] coll:sm:init_query: no other local procs; disqualifying myself
>
> The machine is a multicore machine with 8 cores.
>
> I need to run the SM component code, and I suppose that, with its
> priority raised, it will be the component selected once this problem is
> solved.
>
>
>
> On 03/07/2012, at 21:01, Jeff Squyres wrote:
>
>> The issue is that the "sm" coll component only implements a few of the
>> MPI collective operations. It is usually mixed at run-time with other
>> coll components to fill out the rest of the MPI collective operations.
>>
>> So what is happening is that OMPI is determining that it doesn't have
>> implementations of all the MPI collective operations and aborting.
>>
>> You shouldn't need to manually select your coll module -- OMPI should
>> automatically select the right collective module for you. E.g., if all
>> procs are local on a single machine and sm has a matching
>> implementation for that MPI collective operation, it'll be used.
>>
>>
>>
>> On Jul 3, 2012, at 2:48 PM, Juan Antonio Rico Gallego wrote:
>>
>>> Output is:
>>>
>>> [Metropolis-01:15355] hwloc:base:get_topology
>>> [Metropolis-01:15355] hwloc:base: no cpus specified - using root available cpuset
>>>
>>> ======================== JOB MAP ========================
>>>
>>> Data for node: Metropolis-01  Num procs: 2
>>>     Process OMPI jobid: [59809,1] App: 0 Process rank: 0
>>>     Process OMPI jobid: [59809,1] App: 0 Process rank: 1
>>>
>>> =============================================================
>>> [Metropolis-01:15356] locality: CL:CU:N:B
>>> [Metropolis-01:15356] hwloc:base: get available cpus
>>> [Metropolis-01:15356] hwloc:base:get_available_cpus first time - filtering cpus
>>> [Metropolis-01:15356] hwloc:base: no cpus specified - using root available cpuset
>>> [Metropolis-01:15356] hwloc:base:get_available_cpus root object
>>> [Metropolis-01:15357] locality: CL:CU:N:B
>>> [Metropolis-01:15357] hwloc:base: get available cpus
>>> [Metropolis-01:15357] hwloc:base:get_available_cpus first time - filtering cpus
>>> [Metropolis-01:15357] hwloc:base: no cpus specified - using root available cpuset
>>> [Metropolis-01:15357] hwloc:base:get_available_cpus root object
>>> [Metropolis-01:15356] hwloc:base:get_nbojbs computed data 0 of NUMANode:0
>>> [Metropolis-01:15357] hwloc:base:get_nbojbs computed data 0 of NUMANode:0
>>>
>>>
>>> Regards,
>>> Juan A. Rico
>>> _______________________________________________
>>> devel mailing list
>>> de...@open-mpi.org
>>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>>
>>
>> --
>> Jeff Squyres
>> jsquy...@cisco.com
>> For corporate legal information go to:
>> http://www.cisco.com/web/about/doing_business/legal/cri/