Here is the output.

[jarico@Metropolis-01 examples]$ /home/jarico/shared/packages/openmpi-cas-dbg/bin/mpiexec --bind-to-core --bynode --mca mca_base_verbose 100 --mca mca_coll_base_output 100 --mca coll_sm_priority 99 -mca hwloc_base_verbose 90 --display-map --mca mca_verbose 100 --mca mca_base_verbose 100 --mca coll_base_verbose 100 -n 2 -mca grpcomm_base_verbose 5 ./bmem
[Metropolis-01:24563] mca: base: components_open: Looking for hwloc components
[Metropolis-01:24563] mca: base: components_open: opening hwloc components
[Metropolis-01:24563] mca: base: components_open: found loaded component hwloc142
[Metropolis-01:24563] mca: base: components_open: component hwloc142 has no register function
[Metropolis-01:24563] mca: base: components_open: component hwloc142 has no open function
[Metropolis-01:24563] hwloc:base:get_topology
[Metropolis-01:24563] hwloc:base: no cpus specified - using root available cpuset
[Metropolis-01:24563] mca:base:select:(grpcomm) Querying component [bad]
[Metropolis-01:24563] mca:base:select:(grpcomm) Query of component [bad] set priority to 10
[Metropolis-01:24563] mca:base:select:(grpcomm) Selected component [bad]
[Metropolis-01:24563] [[36265,0],0] grpcomm:base:receive start comm
--------------------------------------------------------------------------
WARNING: a request was made to bind a process. While the system supports
binding the process itself, at least one node does NOT support binding
memory to the process location.

  Node: Metropolis-01

This is a warning only; your job will continue, though performance may be
degraded.
--------------------------------------------------------------------------
[Metropolis-01:24563] hwloc:base: get available cpus
[Metropolis-01:24563] hwloc:base:filter_cpus specified - already done
[Metropolis-01:24563] hwloc:base: get available cpus
[Metropolis-01:24563] hwloc:base:filter_cpus specified - already done
[Metropolis-01:24563] hwloc:base: get available cpus
[Metropolis-01:24563] hwloc:base:filter_cpus specified - already done
[Metropolis-01:24563] hwloc:base: get available cpus
[Metropolis-01:24563] hwloc:base:filter_cpus specified - already done
[Metropolis-01:24563] hwloc:base: get available cpus
[Metropolis-01:24563] hwloc:base:filter_cpus specified - already done
[Metropolis-01:24563] hwloc:base: get available cpus
[Metropolis-01:24563] hwloc:base:filter_cpus specified - already done
[Metropolis-01:24563] hwloc:base: get available cpus
[Metropolis-01:24563] hwloc:base:filter_cpus specified - already done
[Metropolis-01:24563] hwloc:base: get available cpus
[Metropolis-01:24563] hwloc:base:filter_cpus specified - already done
[Metropolis-01:24563] hwloc:base:get_nbojbs computed data 8 of Core:0
[Metropolis-01:24563] hwloc:base: get available cpus
[Metropolis-01:24563] hwloc:base:filter_cpus specified - already done
[Metropolis-01:24563] hwloc:base: get available cpus
[Metropolis-01:24563] hwloc:base:filter_cpus specified - already done

 ======================== JOB MAP ========================

 Data for node: Metropolis-01 Num procs: 2
    Process OMPI jobid: [36265,1] App: 0 Process rank: 0
    Process OMPI jobid: [36265,1] App: 0 Process rank: 1

 =============================================================
[Metropolis-01:24563] [[36265,0],0] grpcomm:bad:xcast sent to job [36265,0] tag 1
[Metropolis-01:24563] [[36265,0],0] grpcomm:xcast:recv:send_relay
[Metropolis-01:24563] [[36265,0],0] grpcomm:base:xcast updating daemon nidmap
[Metropolis-01:24563] [[36265,0],0] orte:daemon:send_relay - recipient list is empty!
[Metropolis-01:24564] mca: base: components_open: Looking for hwloc components [Metropolis-01:24564] mca: base: components_open: opening hwloc components [Metropolis-01:24564] mca: base: components_open: found loaded component hwloc142 [Metropolis-01:24564] mca: base: components_open: component hwloc142 has no register function [Metropolis-01:24564] mca: base: components_open: component hwloc142 has no open function [Metropolis-01:24565] mca: base: components_open: Looking for hwloc components [Metropolis-01:24565] mca: base: components_open: opening hwloc components [Metropolis-01:24565] mca: base: components_open: found loaded component hwloc142 [Metropolis-01:24565] mca: base: components_open: component hwloc142 has no register function [Metropolis-01:24565] mca: base: components_open: component hwloc142 has no open function [Metropolis-01:24564] mca:base:select:(grpcomm) Querying component [bad] [Metropolis-01:24564] mca:base:select:(grpcomm) Query of component [bad] set priority to 10 [Metropolis-01:24564] mca:base:select:(grpcomm) Selected component [bad] [Metropolis-01:24564] [[36265,1],0] grpcomm:base:receive start comm [Metropolis-01:24564] computing locality - getting object at level CORE, index 0 [Metropolis-01:24564] hwloc:base: get available cpus [Metropolis-01:24564] hwloc:base:get_available_cpus first time - filtering cpus [Metropolis-01:24564] hwloc:base: no cpus specified - using root available cpuset [Metropolis-01:24564] computing locality - getting object at level CORE, index 1 [Metropolis-01:24564] hwloc:base: get available cpus [Metropolis-01:24564] hwloc:base:filter_cpus specified - already done [Metropolis-01:24564] computing locality - shifting up from L1CACHE [Metropolis-01:24564] computing locality - shifting up from L2CACHE [Metropolis-01:24564] computing locality - shifting up from L3CACHE [Metropolis-01:24564] computing locality - filling level SOCKET [Metropolis-01:24564] computing locality - filling level NUMA [Metropolis-01:24564] locality: CL:CU:N:B:Nu:S [Metropolis-01:24565] mca:base:select:(grpcomm) Querying component [bad] [Metropolis-01:24565] mca:base:select:(grpcomm) Query of component [bad] set priority to 10 [Metropolis-01:24565] mca:base:select:(grpcomm) Selected component [bad] [Metropolis-01:24565] [[36265,1],1] grpcomm:base:receive start comm [Metropolis-01:24564] mca: base: components_open: Looking for coll components [Metropolis-01:24564] mca: base: components_open: opening coll components [Metropolis-01:24564] mca: base: components_open: found loaded component tuned [Metropolis-01:24564] mca: base: components_open: component tuned has no register function [Metropolis-01:24564] coll:tuned:component_open: done! 
[Metropolis-01:24564] mca: base: components_open: component tuned open function successful [Metropolis-01:24564] mca: base: components_open: found loaded component sm [Metropolis-01:24564] mca: base: components_open: component sm register function successful [Metropolis-01:24564] mca: base: components_open: component sm has no open function [Metropolis-01:24564] mca: base: components_open: found loaded component libnbc [Metropolis-01:24564] mca: base: components_open: component libnbc register function successful [Metropolis-01:24564] mca: base: components_open: component libnbc open function successful [Metropolis-01:24564] mca: base: components_open: found loaded component hierarch [Metropolis-01:24564] mca: base: components_open: component hierarch has no register function [Metropolis-01:24564] mca: base: components_open: component hierarch open function successful [Metropolis-01:24564] mca: base: components_open: found loaded component basic [Metropolis-01:24564] mca: base: components_open: component basic register function successful [Metropolis-01:24564] mca: base: components_open: component basic has no open function [Metropolis-01:24564] mca: base: components_open: found loaded component inter [Metropolis-01:24564] mca: base: components_open: component inter has no register function [Metropolis-01:24564] mca: base: components_open: component inter open function successful [Metropolis-01:24564] mca: base: components_open: found loaded component self [Metropolis-01:24564] mca: base: components_open: component self has no register function [Metropolis-01:24564] mca: base: components_open: component self open function successful [Metropolis-01:24565] computing locality - getting object at level CORE, index 1 [Metropolis-01:24565] hwloc:base: get available cpus [Metropolis-01:24565] hwloc:base:get_available_cpus first time - filtering cpus [Metropolis-01:24565] hwloc:base: no cpus specified - using root available cpuset [Metropolis-01:24565] hwloc:base: get available cpus [Metropolis-01:24565] hwloc:base:filter_cpus specified - already done [Metropolis-01:24565] computing locality - getting object at level CORE, index 0 [Metropolis-01:24565] computing locality - shifting up from L1CACHE [Metropolis-01:24565] computing locality - shifting up from L2CACHE [Metropolis-01:24565] computing locality - shifting up from L3CACHE [Metropolis-01:24565] computing locality - filling level SOCKET [Metropolis-01:24565] computing locality - filling level NUMA [Metropolis-01:24565] locality: CL:CU:N:B:Nu:S [Metropolis-01:24563] [[36265,0],0] COLLECTIVE RECVD FROM [[36265,1],0] [Metropolis-01:24563] [[36265,0],0] WORKING COLLECTIVE 0 [Metropolis-01:24563] [[36265,0],0] ADDING [[36265,1],WILDCARD] TO PARTICIPANTS [Metropolis-01:24563] [[36265,0],0] PROGRESSING COLLECTIVE 0 [Metropolis-01:24563] [[36265,0],0] PROGRESSING COLL id 0 [Metropolis-01:24563] [[36265,0],0] ALL LOCAL PROCS CONTRIBUTE 2 [Metropolis-01:24564] [[36265,1],0] grpcomm:base:modex: performing modex [Metropolis-01:24564] [[36265,1],0] grpcomm:base:pack_modex: reporting 4 entries [Metropolis-01:24564] [[36265,1],0] grpcomm:base:full:modex: executing allgather [Metropolis-01:24564] [[36265,1],0] grpcomm:bad entering allgather [Metropolis-01:24564] [[36265,1],0] grpcomm:bad allgather underway [Metropolis-01:24564] [[36265,1],0] grpcomm:base:modex: modex posted [Metropolis-01:24565] mca: base: components_open: Looking for coll components [Metropolis-01:24565] mca: base: components_open: opening coll components [Metropolis-01:24565] mca: base: 
components_open: found loaded component tuned [Metropolis-01:24565] mca: base: components_open: component tuned has no register function [Metropolis-01:24565] coll:tuned:component_open: done! [Metropolis-01:24565] mca: base: components_open: component tuned open function successful [Metropolis-01:24565] mca: base: components_open: found loaded component sm [Metropolis-01:24565] mca: base: components_open: component sm register function successful [Metropolis-01:24565] mca: base: components_open: component sm has no open function [Metropolis-01:24565] mca: base: components_open: found loaded component libnbc [Metropolis-01:24565] mca: base: components_open: component libnbc register function successful [Metropolis-01:24565] mca: base: components_open: component libnbc open function successful [Metropolis-01:24565] mca: base: components_open: found loaded component hierarch [Metropolis-01:24565] mca: base: components_open: component hierarch has no register function [Metropolis-01:24565] mca: base: components_open: component hierarch open function successful [Metropolis-01:24565] mca: base: components_open: found loaded component basic [Metropolis-01:24565] mca: base: components_open: component basic register function successful [Metropolis-01:24565] mca: base: components_open: component basic has no open function [Metropolis-01:24565] mca: base: components_open: found loaded component inter [Metropolis-01:24565] mca: base: components_open: component inter has no register function [Metropolis-01:24565] mca: base: components_open: component inter open function successful [Metropolis-01:24565] mca: base: components_open: found loaded component self [Metropolis-01:24565] mca: base: components_open: component self has no register function [Metropolis-01:24565] mca: base: components_open: component self open function successful [Metropolis-01:24563] [[36265,0],0] COLLECTIVE RECVD FROM [[36265,1],1] [Metropolis-01:24563] [[36265,0],0] WORKING COLLECTIVE 0 [Metropolis-01:24563] [[36265,0],0] PROGRESSING COLLECTIVE 0 [Metropolis-01:24563] [[36265,0],0] PROGRESSING COLL id 0 [Metropolis-01:24563] [[36265,0],0] ALL LOCAL PROCS CONTRIBUTE 2 [Metropolis-01:24563] [[36265,0],0] COLLECTIVE 0 LOCALLY COMPLETE - SENDING TO GLOBAL COLLECTIVE [Metropolis-01:24563] [[36265,0],0] grpcomm:base:daemon_coll: daemon collective recvd from [[36265,0],0] [Metropolis-01:24563] [[36265,0],0] grpcomm:base:daemon_coll: WORKING COLLECTIVE 0 [Metropolis-01:24563] [[36265,0],0] grpcomm:base:daemon_coll: NUM CONTRIBS: 2 [Metropolis-01:24563] [[36265,0],0] grpcomm:bad:xcast sent to job [36265,1] tag 30 [Metropolis-01:24563] [[36265,0],0] grpcomm:xcast:recv:send_relay [Metropolis-01:24563] [[36265,0],0] orte:daemon:send_relay - recipient list is empty! 
[Metropolis-01:24565] [[36265,1],1] grpcomm:base:modex: performing modex [Metropolis-01:24565] [[36265,1],1] grpcomm:base:pack_modex: reporting 4 entries [Metropolis-01:24565] [[36265,1],1] grpcomm:base:full:modex: executing allgather [Metropolis-01:24565] [[36265,1],1] grpcomm:bad entering allgather [Metropolis-01:24565] [[36265,1],1] grpcomm:bad allgather underway [Metropolis-01:24565] [[36265,1],1] grpcomm:base:modex: modex posted [Metropolis-01:24564] [[36265,1],0] grpcomm:base:receive processing collective return for id 0 [Metropolis-01:24564] [[36265,1],0] CHECKING COLL id 0 [Metropolis-01:24564] [[36265,1],0] STORING MODEX DATA [Metropolis-01:24564] [[36265,1],0] grpcomm:base:store_modex adding modex entry for proc [[36265,1],0] [Metropolis-01:24565] [[36265,1],1] grpcomm:base:receive processing collective return for id 0 [Metropolis-01:24565] [[36265,1],1] CHECKING COLL id 0 [Metropolis-01:24565] [[36265,1],1] STORING MODEX DATA [Metropolis-01:24565] [[36265,1],1] grpcomm:base:store_modex adding modex entry for proc [[36265,1],0] [Metropolis-01:24564] [[36265,1],0] grpcomm:base:update_modex_entries: adding 4 entries for proc [[36265,1],0] [Metropolis-01:24564] [[36265,1],0] grpcomm:base:store_modex adding modex entry for proc [[36265,1],1] [Metropolis-01:24564] [[36265,1],0] grpcomm:base:update_modex_entries: adding 4 entries for proc [[36265,1],1] [Metropolis-01:24565] [[36265,1],1] grpcomm:base:update_modex_entries: adding 4 entries for proc [[36265,1],0] [Metropolis-01:24565] [[36265,1],1] grpcomm:base:store_modex adding modex entry for proc [[36265,1],1] [Metropolis-01:24565] [[36265,1],1] grpcomm:base:update_modex_entries: adding 4 entries for proc [[36265,1],1] [Metropolis-01:24564] coll:find_available: querying coll component tuned [Metropolis-01:24564] coll:find_available: coll component tuned is available [Metropolis-01:24565] coll:find_available: querying coll component tuned [Metropolis-01:24565] coll:find_available: coll component tuned is available [Metropolis-01:24565] coll:find_available: querying coll component sm [Metropolis-01:24564] coll:find_available: querying coll component sm [Metropolis-01:24564] coll:sm:init_query: no other local procs; disqualifying myself [Metropolis-01:24564] coll:find_available: coll component sm is not available [Metropolis-01:24564] coll:find_available: querying coll component libnbc [Metropolis-01:24564] coll:find_available: coll component libnbc is available [Metropolis-01:24564] coll:find_available: querying coll component hierarch [Metropolis-01:24564] coll:find_available: coll component hierarch is available [Metropolis-01:24564] coll:find_available: querying coll component basic [Metropolis-01:24564] coll:find_available: coll component basic is available [Metropolis-01:24565] coll:sm:init_query: no other local procs; disqualifying myself [Metropolis-01:24565] coll:find_available: coll component sm is not available [Metropolis-01:24565] coll:find_available: querying coll component libnbc [Metropolis-01:24565] coll:find_available: coll component libnbc is available [Metropolis-01:24565] coll:find_available: querying coll component hierarch [Metropolis-01:24565] coll:find_available: coll component hierarch is available [Metropolis-01:24565] coll:find_available: querying coll component basic [Metropolis-01:24565] coll:find_available: coll component basic is available [Metropolis-01:24564] coll:find_available: querying coll component inter [Metropolis-01:24564] coll:find_available: coll component inter is available 
[Metropolis-01:24564] coll:find_available: querying coll component self [Metropolis-01:24564] coll:find_available: coll component self is available [Metropolis-01:24565] coll:find_available: querying coll component inter [Metropolis-01:24565] coll:find_available: coll component inter is available [Metropolis-01:24565] coll:find_available: querying coll component self [Metropolis-01:24565] coll:find_available: coll component self is available [Metropolis-01:24565] hwloc:base:get_nbojbs computed data 0 of NUMANode:0 [Metropolis-01:24564] hwloc:base:get_nbojbs computed data 0 of NUMANode:0 [Metropolis-01:24563] [[36265,0],0] COLLECTIVE RECVD FROM [[36265,1],1] [Metropolis-01:24563] [[36265,0],0] WORKING COLLECTIVE 1 [Metropolis-01:24563] [[36265,0],0] ADDING [[36265,1],WILDCARD] TO PARTICIPANTS [Metropolis-01:24563] [[36265,0],0] PROGRESSING COLLECTIVE 1 [Metropolis-01:24563] [[36265,0],0] PROGRESSING COLL id 1 [Metropolis-01:24563] [[36265,0],0] ALL LOCAL PROCS CONTRIBUTE 2 [Metropolis-01:24563] [[36265,0],0] COLLECTIVE RECVD FROM [[36265,1],0] [Metropolis-01:24563] [[36265,0],0] WORKING COLLECTIVE 1 [Metropolis-01:24563] [[36265,0],0] PROGRESSING COLLECTIVE 1 [Metropolis-01:24563] [[36265,0],0] PROGRESSING COLL id 1 [Metropolis-01:24563] [[36265,0],0] ALL LOCAL PROCS CONTRIBUTE 2 [Metropolis-01:24563] [[36265,0],0] COLLECTIVE 1 LOCALLY COMPLETE - SENDING TO GLOBAL COLLECTIVE [Metropolis-01:24563] [[36265,0],0] grpcomm:base:daemon_coll: daemon collective recvd from [[36265,0],0] [Metropolis-01:24563] [[36265,0],0] grpcomm:base:daemon_coll: WORKING COLLECTIVE 1 [Metropolis-01:24563] [[36265,0],0] grpcomm:base:daemon_coll: NUM CONTRIBS: 2 [Metropolis-01:24563] [[36265,0],0] grpcomm:bad:xcast sent to job [36265,1] tag 30 [Metropolis-01:24563] [[36265,0],0] grpcomm:xcast:recv:send_relay [Metropolis-01:24563] [[36265,0],0] orte:daemon:send_relay - recipient list is empty! [Metropolis-01:24565] [[36265,1],1] grpcomm:bad entering barrier [Metropolis-01:24565] [[36265,1],1] grpcomm:bad barrier underway [Metropolis-01:24564] [[36265,1],0] grpcomm:bad entering barrier [Metropolis-01:24564] [[36265,1],0] grpcomm:bad barrier underway [Metropolis-01:24564] [[36265,1],0] grpcomm:base:receive processing collective return for id 1 [Metropolis-01:24564] [[36265,1],0] CHECKING COLL id 1 [Metropolis-01:24565] [[36265,1],1] grpcomm:base:receive processing collective return for id 1 [Metropolis-01:24565] [[36265,1],1] CHECKING COLL id 1 [Metropolis-01:24565] coll:base:comm_select: new communicator: MPI_COMM_WORLD (cid 0) [Metropolis-01:24565] coll:base:comm_select: Checking all available modules [Metropolis-01:24565] coll:tuned:module_tuned query called [Metropolis-01:24565] coll:base:comm_select: component available: tuned, priority: 30 [Metropolis-01:24565] coll:base:comm_select: component available: libnbc, priority: 10 [Metropolis-01:24565] coll:base:comm_select: component not available: hierarch [Metropolis-01:24565] coll:base:comm_select: component available: basic, priority: 10 [Metropolis-01:24565] coll:base:comm_select: component not available: inter [Metropolis-01:24565] coll:base:comm_select: component not available: self [Metropolis-01:24565] coll:tuned:module_init called. 
[Metropolis-01:24565] coll:tuned:module_init Tuned is in use [Metropolis-01:24565] coll:base:comm_select: new communicator: MPI_COMM_SELF (cid 1) [Metropolis-01:24565] coll:base:comm_select: Checking all available modules [Metropolis-01:24564] coll:base:comm_select: new communicator: MPI_COMM_WORLD (cid 0) [Metropolis-01:24564] coll:base:comm_select: Checking all available modules [Metropolis-01:24564] coll:tuned:module_tuned query called [Metropolis-01:24564] coll:base:comm_select: component available: tuned, priority: 30 [Metropolis-01:24564] coll:base:comm_select: component available: libnbc, priority: 10 [Metropolis-01:24564] coll:base:comm_select: component not available: hierarch [Metropolis-01:24564] coll:base:comm_select: component available: basic, priority: 10 [Metropolis-01:24564] coll:base:comm_select: component not available: inter [Metropolis-01:24564] coll:base:comm_select: component not available: self [Metropolis-01:24564] coll:tuned:module_init called. [Metropolis-01:24565] coll:tuned:module_tuned query called [Metropolis-01:24565] coll:base:comm_select: component not available: tuned [Metropolis-01:24565] coll:base:comm_select: component available: libnbc, priority: 10 [Metropolis-01:24565] coll:base:comm_select: component not available: hierarch [Metropolis-01:24565] coll:base:comm_select: component available: basic, priority: 10 [Metropolis-01:24565] coll:base:comm_select: component not available: inter [Metropolis-01:24565] coll:base:comm_select: component available: self, priority: 75 [Metropolis-01:24564] coll:tuned:module_init Tuned is in use [Metropolis-01:24564] coll:base:comm_select: new communicator: MPI_COMM_SELF (cid 1) [Metropolis-01:24564] coll:base:comm_select: Checking all available modules [Metropolis-01:24564] coll:tuned:module_tuned query called [Metropolis-01:24564] coll:base:comm_select: component not available: tuned [Metropolis-01:24564] coll:base:comm_select: component available: libnbc, priority: 10 [Metropolis-01:24564] coll:base:comm_select: component not available: hierarch [Metropolis-01:24564] coll:base:comm_select: component available: basic, priority: 10 [Metropolis-01:24564] coll:base:comm_select: component not available: inter [Metropolis-01:24564] coll:base:comm_select: component available: self, priority: 75 [Metropolis-01:24565] [[36265,1],1] grpcomm:bad entering barrier [Metropolis-01:24563] [[36265,0],0] COLLECTIVE RECVD FROM [[36265,1],1] [Metropolis-01:24563] [[36265,0],0] WORKING COLLECTIVE 2 [Metropolis-01:24563] [[36265,0],0] ADDING [[36265,1],WILDCARD] TO PARTICIPANTS [Metropolis-01:24563] [[36265,0],0] PROGRESSING COLLECTIVE 2 [Metropolis-01:24563] [[36265,0],0] PROGRESSING COLL id 2 [Metropolis-01:24563] [[36265,0],0] ALL LOCAL PROCS CONTRIBUTE 2 [Metropolis-01:24563] [[36265,0],0] COLLECTIVE RECVD FROM [[36265,1],0] [Metropolis-01:24563] [[36265,0],0] WORKING COLLECTIVE 2 [Metropolis-01:24563] [[36265,0],0] PROGRESSING COLLECTIVE 2 [Metropolis-01:24563] [[36265,0],0] PROGRESSING COLL id 2 [Metropolis-01:24563] [[36265,0],0] ALL LOCAL PROCS CONTRIBUTE 2 [Metropolis-01:24563] [[36265,0],0] COLLECTIVE 2 LOCALLY COMPLETE - SENDING TO GLOBAL COLLECTIVE [Metropolis-01:24563] [[36265,0],0] grpcomm:base:daemon_coll: daemon collective recvd from [[36265,0],0] [Metropolis-01:24563] [[36265,0],0] grpcomm:base:daemon_coll: WORKING COLLECTIVE 2 [Metropolis-01:24563] [[36265,0],0] grpcomm:base:daemon_coll: NUM CONTRIBS: 2 [Metropolis-01:24563] [[36265,0],0] grpcomm:bad:xcast sent to job [36265,1] tag 30 [Metropolis-01:24563] 
[[36265,0],0] grpcomm:xcast:recv:send_relay
[Metropolis-01:24563] [[36265,0],0] orte:daemon:send_relay - recipient list is empty!
[Metropolis-01:24564] [[36265,1],0] grpcomm:bad entering barrier
[Metropolis-01:24564] [[36265,1],0] grpcomm:bad barrier underway
[Metropolis-01:24564] [[36265,1],0] grpcomm:base:receive processing collective return for id 2
[Metropolis-01:24564] [[36265,1],0] CHECKING COLL id 2
[Metropolis-01:24565] [[36265,1],1] grpcomm:bad barrier underway
[Metropolis-01:24565] [[36265,1],1] grpcomm:base:receive processing collective return for id 2
[Metropolis-01:24565] [[36265,1],1] CHECKING COLL id 2
[Metropolis-01:24565] coll:tuned:component_close: called
[Metropolis-01:24565] coll:tuned:component_close: done!
[Metropolis-01:24565] mca: base: close: component tuned closed
[Metropolis-01:24565] mca: base: close: unloading component tuned
[Metropolis-01:24565] mca: base: close: component libnbc closed
[Metropolis-01:24565] mca: base: close: unloading component libnbc
[Metropolis-01:24565] mca: base: close: unloading component hierarch
[Metropolis-01:24565] mca: base: close: unloading component basic
[Metropolis-01:24565] mca: base: close: unloading component inter
[Metropolis-01:24565] mca: base: close: unloading component self
[Metropolis-01:24565] [[36265,1],1] grpcomm:base:receive stop comm
[Metropolis-01:24564] coll:tuned:component_close: called
[Metropolis-01:24564] coll:tuned:component_close: done!
[Metropolis-01:24564] mca: base: close: component tuned closed
[Metropolis-01:24564] mca: base: close: unloading component tuned
[Metropolis-01:24564] mca: base: close: component libnbc closed
[Metropolis-01:24564] mca: base: close: unloading component libnbc
[Metropolis-01:24564] mca: base: close: unloading component hierarch
[Metropolis-01:24564] mca: base: close: unloading component basic
[Metropolis-01:24564] mca: base: close: unloading component inter
[Metropolis-01:24564] mca: base: close: unloading component self
[Metropolis-01:24564] [[36265,1],0] grpcomm:base:receive stop comm
[Metropolis-01:24563] [[36265,0],0] grpcomm:bad:xcast sent to job [36265,0] tag 1
[Metropolis-01:24563] [[36265,0],0] grpcomm:xcast:recv:send_relay
[Metropolis-01:24563] [[36265,0],0] orte:daemon:send_relay - recipient list is empty!
[jarico@Metropolis-01 examples]$

On 03/07/2012, at 21:44, Ralph Castain wrote:

> Interesting - yes, coll sm doesn't think they are on the same node for some
> reason. Try adding -mca grpcomm_base_verbose 5 and let's see why
>
>
> On Jul 3, 2012, at 1:24 PM, Juan Antonio Rico Gallego wrote:
>
>> The code I run is a simple broadcast.
>> >> When I do not specify components to run, the output is (more verbose): >> >> [jarico@Metropolis-01 examples]$ >> /home/jarico/shared/packages/openmpi-cas-dbg/bin/mpiexec --mca >> mca_base_verbose 100 --mca mca_coll_base_output 100 --mca coll_sm_priority >> 99 -mca hwloc_base_verbose 90 --display-map --mca mca_verbose 100 --mca >> mca_base_verbose 100 --mca coll_base_verbose 100 -n 2 ./bmem >> [Metropolis-01:24490] mca: base: components_open: Looking for hwloc >> components >> [Metropolis-01:24490] mca: base: components_open: opening hwloc components >> [Metropolis-01:24490] mca: base: components_open: found loaded component >> hwloc142 >> [Metropolis-01:24490] mca: base: components_open: component hwloc142 has no >> register function >> [Metropolis-01:24490] mca: base: components_open: component hwloc142 has no >> open function >> [Metropolis-01:24490] hwloc:base:get_topology >> [Metropolis-01:24490] hwloc:base: no cpus specified - using root available >> cpuset >> >> ======================== JOB MAP ======================== >> >> Data for node: Metropolis-01 Num procs: 2 >> Process OMPI jobid: [36336,1] App: 0 Process rank: 0 >> Process OMPI jobid: [36336,1] App: 0 Process rank: 1 >> >> ============================================================= >> [Metropolis-01:24491] mca: base: components_open: Looking for hwloc >> components >> [Metropolis-01:24491] mca: base: components_open: opening hwloc components >> [Metropolis-01:24491] mca: base: components_open: found loaded component >> hwloc142 >> [Metropolis-01:24491] mca: base: components_open: component hwloc142 has no >> register function >> [Metropolis-01:24491] mca: base: components_open: component hwloc142 has no >> open function >> [Metropolis-01:24492] mca: base: components_open: Looking for hwloc >> components >> [Metropolis-01:24492] mca: base: components_open: opening hwloc components >> [Metropolis-01:24492] mca: base: components_open: found loaded component >> hwloc142 >> [Metropolis-01:24492] mca: base: components_open: component hwloc142 has no >> register function >> [Metropolis-01:24492] mca: base: components_open: component hwloc142 has no >> open function >> [Metropolis-01:24491] locality: CL:CU:N:B >> [Metropolis-01:24491] hwloc:base: get available cpus >> [Metropolis-01:24491] hwloc:base:get_available_cpus first time - filtering >> cpus >> [Metropolis-01:24491] hwloc:base: no cpus specified - using root available >> cpuset >> [Metropolis-01:24491] hwloc:base:get_available_cpus root object >> [Metropolis-01:24491] mca: base: components_open: Looking for coll components >> [Metropolis-01:24491] mca: base: components_open: opening coll components >> [Metropolis-01:24491] mca: base: components_open: found loaded component >> tuned >> [Metropolis-01:24491] mca: base: components_open: component tuned has no >> register function >> [Metropolis-01:24491] coll:tuned:component_open: done! 
>> [Metropolis-01:24491] mca: base: components_open: component tuned open >> function successful >> [Metropolis-01:24491] mca: base: components_open: found loaded component sm >> [Metropolis-01:24491] mca: base: components_open: component sm register >> function successful >> [Metropolis-01:24491] mca: base: components_open: component sm has no open >> function >> [Metropolis-01:24491] mca: base: components_open: found loaded component >> libnbc >> [Metropolis-01:24491] mca: base: components_open: component libnbc register >> function successful >> [Metropolis-01:24491] mca: base: components_open: component libnbc open >> function successful >> [Metropolis-01:24491] mca: base: components_open: found loaded component >> hierarch >> [Metropolis-01:24491] mca: base: components_open: component hierarch has no >> register function >> [Metropolis-01:24491] mca: base: components_open: component hierarch open >> function successful >> [Metropolis-01:24491] mca: base: components_open: found loaded component >> basic >> [Metropolis-01:24491] mca: base: components_open: component basic register >> function successful >> [Metropolis-01:24491] mca: base: components_open: component basic has no >> open function >> [Metropolis-01:24491] mca: base: components_open: found loaded component >> inter >> [Metropolis-01:24491] mca: base: components_open: component inter has no >> register function >> [Metropolis-01:24491] mca: base: components_open: component inter open >> function successful >> [Metropolis-01:24491] mca: base: components_open: found loaded component self >> [Metropolis-01:24491] mca: base: components_open: component self has no >> register function >> [Metropolis-01:24491] mca: base: components_open: component self open >> function successful >> [Metropolis-01:24492] locality: CL:CU:N:B >> [Metropolis-01:24492] hwloc:base: get available cpus >> [Metropolis-01:24492] hwloc:base:get_available_cpus first time - filtering >> cpus >> [Metropolis-01:24492] hwloc:base: no cpus specified - using root available >> cpuset >> [Metropolis-01:24492] hwloc:base:get_available_cpus root object >> [Metropolis-01:24492] mca: base: components_open: Looking for coll components >> [Metropolis-01:24492] mca: base: components_open: opening coll components >> [Metropolis-01:24492] mca: base: components_open: found loaded component >> tuned >> [Metropolis-01:24492] mca: base: components_open: component tuned has no >> register function >> [Metropolis-01:24492] coll:tuned:component_open: done! 
>> [Metropolis-01:24492] mca: base: components_open: component tuned open >> function successful >> [Metropolis-01:24492] mca: base: components_open: found loaded component sm >> [Metropolis-01:24492] mca: base: components_open: component sm register >> function successful >> [Metropolis-01:24492] mca: base: components_open: component sm has no open >> function >> [Metropolis-01:24492] mca: base: components_open: found loaded component >> libnbc >> [Metropolis-01:24492] mca: base: components_open: component libnbc register >> function successful >> [Metropolis-01:24492] mca: base: components_open: component libnbc open >> function successful >> [Metropolis-01:24492] mca: base: components_open: found loaded component >> hierarch >> [Metropolis-01:24492] mca: base: components_open: component hierarch has no >> register function >> [Metropolis-01:24492] mca: base: components_open: component hierarch open >> function successful >> [Metropolis-01:24492] mca: base: components_open: found loaded component >> basic >> [Metropolis-01:24492] mca: base: components_open: component basic register >> function successful >> [Metropolis-01:24492] mca: base: components_open: component basic has no >> open function >> [Metropolis-01:24492] mca: base: components_open: found loaded component >> inter >> [Metropolis-01:24492] mca: base: components_open: component inter has no >> register function >> [Metropolis-01:24492] mca: base: components_open: component inter open >> function successful >> [Metropolis-01:24492] mca: base: components_open: found loaded component self >> [Metropolis-01:24492] mca: base: components_open: component self has no >> register function >> [Metropolis-01:24492] mca: base: components_open: component self open >> function successful >> [Metropolis-01:24491] coll:find_available: querying coll component tuned >> [Metropolis-01:24491] coll:find_available: coll component tuned is available >> [Metropolis-01:24491] coll:find_available: querying coll component sm >> [Metropolis-01:24491] coll:sm:init_query: no other local procs; >> disqualifying myself >> [Metropolis-01:24491] coll:find_available: coll component sm is not available >> [Metropolis-01:24491] coll:find_available: querying coll component libnbc >> [Metropolis-01:24491] coll:find_available: coll component libnbc is available >> [Metropolis-01:24491] coll:find_available: querying coll component hierarch >> [Metropolis-01:24491] coll:find_available: coll component hierarch is >> available >> [Metropolis-01:24491] coll:find_available: querying coll component basic >> [Metropolis-01:24491] coll:find_available: coll component basic is available >> [Metropolis-01:24491] coll:find_available: querying coll component inter >> [Metropolis-01:24492] coll:find_available: querying coll component tuned >> [Metropolis-01:24492] coll:find_available: coll component tuned is available >> [Metropolis-01:24492] coll:find_available: querying coll component sm >> [Metropolis-01:24492] coll:sm:init_query: no other local procs; >> disqualifying myself >> [Metropolis-01:24492] coll:find_available: coll component sm is not available >> [Metropolis-01:24492] coll:find_available: querying coll component libnbc >> [Metropolis-01:24492] coll:find_available: coll component libnbc is available >> [Metropolis-01:24492] coll:find_available: querying coll component hierarch >> [Metropolis-01:24492] coll:find_available: coll component hierarch is >> available >> [Metropolis-01:24492] coll:find_available: querying coll component basic >> [Metropolis-01:24492] 
coll:find_available: coll component basic is available >> [Metropolis-01:24492] coll:find_available: querying coll component inter >> [Metropolis-01:24492] coll:find_available: coll component inter is available >> [Metropolis-01:24492] coll:find_available: querying coll component self >> [Metropolis-01:24492] coll:find_available: coll component self is available >> [Metropolis-01:24491] coll:find_available: coll component inter is available >> [Metropolis-01:24491] coll:find_available: querying coll component self >> [Metropolis-01:24491] coll:find_available: coll component self is available >> [Metropolis-01:24492] hwloc:base:get_nbojbs computed data 0 of NUMANode:0 >> [Metropolis-01:24491] hwloc:base:get_nbojbs computed data 0 of NUMANode:0 >> [Metropolis-01:24491] coll:base:comm_select: new communicator: >> MPI_COMM_WORLD (cid 0) >> [Metropolis-01:24491] coll:base:comm_select: Checking all available modules >> [Metropolis-01:24491] coll:tuned:module_tuned query called >> [Metropolis-01:24491] coll:base:comm_select: component available: tuned, >> priority: 30 >> [Metropolis-01:24491] coll:base:comm_select: component available: libnbc, >> priority: 10 >> [Metropolis-01:24491] coll:base:comm_select: component not available: >> hierarch >> [Metropolis-01:24491] coll:base:comm_select: component available: basic, >> priority: 10 >> [Metropolis-01:24491] coll:base:comm_select: component not available: inter >> [Metropolis-01:24491] coll:base:comm_select: component not available: self >> [Metropolis-01:24491] coll:tuned:module_init called. >> [Metropolis-01:24491] coll:tuned:module_init Tuned is in use >> [Metropolis-01:24491] coll:base:comm_select: new communicator: MPI_COMM_SELF >> (cid 1) >> [Metropolis-01:24491] coll:base:comm_select: Checking all available modules >> [Metropolis-01:24491] coll:tuned:module_tuned query called >> [Metropolis-01:24491] coll:base:comm_select: component not available: tuned >> [Metropolis-01:24491] coll:base:comm_select: component available: libnbc, >> priority: 10 >> [Metropolis-01:24491] coll:base:comm_select: component not available: >> hierarch >> [Metropolis-01:24491] coll:base:comm_select: component available: basic, >> priority: 10 >> [Metropolis-01:24491] coll:base:comm_select: component not available: inter >> [Metropolis-01:24491] coll:base:comm_select: component available: self, >> priority: 75 >> [Metropolis-01:24492] coll:base:comm_select: new communicator: >> MPI_COMM_WORLD (cid 0) >> [Metropolis-01:24492] coll:base:comm_select: Checking all available modules >> [Metropolis-01:24492] coll:tuned:module_tuned query called >> [Metropolis-01:24492] coll:base:comm_select: component available: tuned, >> priority: 30 >> [Metropolis-01:24492] coll:base:comm_select: component available: libnbc, >> priority: 10 >> [Metropolis-01:24492] coll:base:comm_select: component not available: >> hierarch >> [Metropolis-01:24492] coll:base:comm_select: component available: basic, >> priority: 10 >> [Metropolis-01:24492] coll:base:comm_select: component not available: inter >> [Metropolis-01:24492] coll:base:comm_select: component not available: self >> [Metropolis-01:24492] coll:tuned:module_init called. 
>> [Metropolis-01:24492] coll:tuned:module_init Tuned is in use
>> [Metropolis-01:24492] coll:base:comm_select: new communicator: MPI_COMM_SELF (cid 1)
>> [Metropolis-01:24492] coll:base:comm_select: Checking all available modules
>> [Metropolis-01:24492] coll:tuned:module_tuned query called
>> [Metropolis-01:24492] coll:base:comm_select: component not available: tuned
>> [Metropolis-01:24492] coll:base:comm_select: component available: libnbc, priority: 10
>> [Metropolis-01:24492] coll:base:comm_select: component not available: hierarch
>> [Metropolis-01:24492] coll:base:comm_select: component available: basic, priority: 10
>> [Metropolis-01:24492] coll:base:comm_select: component not available: inter
>> [Metropolis-01:24492] coll:base:comm_select: component available: self, priority: 75
>> [Metropolis-01:24491] coll:tuned:component_close: called
>> [Metropolis-01:24491] coll:tuned:component_close: done!
>> [Metropolis-01:24492] coll:tuned:component_close: called
>> [Metropolis-01:24492] coll:tuned:component_close: done!
>> [Metropolis-01:24492] mca: base: close: component tuned closed
>> [Metropolis-01:24492] mca: base: close: unloading component tuned
>> [Metropolis-01:24492] mca: base: close: component libnbc closed
>> [Metropolis-01:24492] mca: base: close: unloading component libnbc
>> [Metropolis-01:24492] mca: base: close: unloading component hierarch
>> [Metropolis-01:24492] mca: base: close: unloading component basic
>> [Metropolis-01:24492] mca: base: close: unloading component inter
>> [Metropolis-01:24492] mca: base: close: unloading component self
>> [Metropolis-01:24491] mca: base: close: component tuned closed
>> [Metropolis-01:24491] mca: base: close: unloading component tuned
>> [Metropolis-01:24491] mca: base: close: component libnbc closed
>> [Metropolis-01:24491] mca: base: close: unloading component libnbc
>> [Metropolis-01:24491] mca: base: close: unloading component hierarch
>> [Metropolis-01:24491] mca: base: close: unloading component basic
>> [Metropolis-01:24491] mca: base: close: unloading component inter
>> [Metropolis-01:24491] mca: base: close: unloading component self
>> [jarico@Metropolis-01 examples]$
>>
>>
>> SM is not loaded because it detects no other processes on the same machine:
>>
>> [Metropolis-01:24491] coll:sm:init_query: no other local procs; disqualifying myself
>>
>> The machine is a multicore machine with 8 cores.
>>
>> I need to run the SM component code, and I suppose that, by raising its priority, it will be the component selected once the problem is solved.
>>
>>
>> On 03/07/2012, at 21:01, Jeff Squyres wrote:
>>
>>> The issue is that the "sm" coll component only implements a few of the MPI
>>> collective operations. It is usually mixed at run-time with other coll
>>> components to fill out the rest of the MPI collective operations.
>>>
>>> So what is happening is that OMPI is determining that it doesn't have
>>> implementations of all the MPI collective operations and aborting.
>>>
>>> You shouldn't need to manually select your coll module -- OMPI should
>>> automatically select the right collective module for you. E.g., if all
>>> procs are local on a single machine and sm has a matching implementation
>>> for that MPI collective operation, it'll be used.
>>>
>>>
>>> On Jul 3, 2012, at 2:48 PM, Juan Antonio Rico Gallego wrote:
>>>
>>>> Output is:
>>>>
>>>> [Metropolis-01:15355] hwloc:base:get_topology
>>>> [Metropolis-01:15355] hwloc:base: no cpus specified - using root available cpuset
>>>>
>>>> ======================== JOB MAP ========================
>>>>
>>>> Data for node: Metropolis-01 Num procs: 2
>>>>    Process OMPI jobid: [59809,1] App: 0 Process rank: 0
>>>>    Process OMPI jobid: [59809,1] App: 0 Process rank: 1
>>>>
>>>> =============================================================
>>>> [Metropolis-01:15356] locality: CL:CU:N:B
>>>> [Metropolis-01:15356] hwloc:base: get available cpus
>>>> [Metropolis-01:15356] hwloc:base:get_available_cpus first time - filtering cpus
>>>> [Metropolis-01:15356] hwloc:base: no cpus specified - using root available cpuset
>>>> [Metropolis-01:15356] hwloc:base:get_available_cpus root object
>>>> [Metropolis-01:15357] locality: CL:CU:N:B
>>>> [Metropolis-01:15357] hwloc:base: get available cpus
>>>> [Metropolis-01:15357] hwloc:base:get_available_cpus first time - filtering cpus
>>>> [Metropolis-01:15357] hwloc:base: no cpus specified - using root available cpuset
>>>> [Metropolis-01:15357] hwloc:base:get_available_cpus root object
>>>> [Metropolis-01:15356] hwloc:base:get_nbojbs computed data 0 of NUMANode:0
>>>> [Metropolis-01:15357] hwloc:base:get_nbojbs computed data 0 of NUMANode:0
>>>>
>>>>
>>>> Regards,
>>>> Juan A. Rico
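
A note on the test program: the thread only describes bmem as "a simple broadcast", and its source is not included above. A minimal sketch of that kind of test is shown below; this is a hypothetical stand-in, not the actual bmem code, and the buffer size and output are purely illustrative:

    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char *argv[])
    {
        int rank, size, i;
        int buf[4] = {0, 0, 0, 0};

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        /* Root fills the buffer; MPI_Bcast then runs through whichever coll
         * module was selected for MPI_COMM_WORLD (tuned in the logs above,
         * since coll sm disqualified itself during init_query). */
        if (rank == 0) {
            for (i = 0; i < 4; i++) {
                buf[i] = i + 1;
            }
        }
        MPI_Bcast(buf, 4, MPI_INT, 0, MPI_COMM_WORLD);

        printf("rank %d of %d: buf[0] = %d\n", rank, size, buf[0]);

        MPI_Finalize();
        return 0;
    }

Built with mpicc and launched with the mpiexec line at the top of this message, a program like this goes through the same comm_select path shown in the logs; raising coll_sm_priority only takes effect once coll sm stops disqualifying itself, i.e. once each rank sees the other as a local peer on the node.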