Okay, please try this again with r26739 or above. You can remove the rest of the "verbose" settings and the --display-map option to declutter the output. Please add "-mca orte_nidmap_verbose 20" to your command line.
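For reference, here is a rough sketch of what the trimmed command line might look like. The binary path, --bind-to-core, --bynode, coll_sm_priority 99, -n 2, and ./bmem are copied from the run quoted below; keeping them unchanged is an assumption. Only -mca orte_nidmap_verbose 20 is new. The snippet only prints the command so it can be checked before launching.

```shell
# Sketch only: path and mapping flags are copied from the quoted run; adjust to your install.
MPIEXEC=/home/jarico/shared/packages/openmpi-cas-dbg/bin/mpiexec

# All other "verbose" settings and --display-map are dropped;
# orte_nidmap_verbose is the one diagnostic setting that is added.
CMD="$MPIEXEC --bind-to-core --bynode --mca coll_sm_priority 99 -mca orte_nidmap_verbose 20 -n 2 ./bmem"

# Print the command for inspection; paste it at the prompt to run it.
echo "$CMD"
```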
Thanks!
Ralph

On Tue, Jul 3, 2012 at 1:50 PM, Juan A. Rico <jar...@unex.es> wrote:

> Here is the output.
>
> [jarico@Metropolis-01 examples]$
> /home/jarico/shared/packages/openmpi-cas-dbg/bin/mpiexec --bind-to-core
> --bynode --mca mca_base_verbose 100 --mca mca_coll_base_output 100 --mca
> coll_sm_priority 99 -mca hwloc_base_verbose 90 --display-map --mca
> mca_verbose 100 --mca mca_base_verbose 100 --mca coll_base_verbose 100 -n 2
> -mca grpcomm_base_verbose 5 ./bmem
> [Metropolis-01:24563] mca: base: components_open: Looking for hwloc
> components
> [Metropolis-01:24563] mca: base: components_open: opening hwloc components
> [Metropolis-01:24563] mca: base: components_open: found loaded component
> hwloc142
> [Metropolis-01:24563] mca: base: components_open: component hwloc142 has
> no register function
> [Metropolis-01:24563] mca: base: components_open: component hwloc142 has
> no open function
> [Metropolis-01:24563] hwloc:base:get_topology
> [Metropolis-01:24563] hwloc:base: no cpus specified - using root available
> cpuset
> [Metropolis-01:24563] mca:base:select:(grpcomm) Querying component [bad]
> [Metropolis-01:24563] mca:base:select:(grpcomm) Query of component [bad]
> set priority to 10
> [Metropolis-01:24563] mca:base:select:(grpcomm) Selected component [bad]
> [Metropolis-01:24563] [[36265,0],0] grpcomm:base:receive start comm
> --------------------------------------------------------------------------
> WARNING: a request was made to bind a process. While the system
> supports binding the process itself, at least one node does NOT
> support binding memory to the process location.
>
> Node: Metropolis-01
>
> This is a warning only; your job will continue, though performance may
> be degraded.
> -------------------------------------------------------------------------- > [Metropolis-01:24563] hwloc:base: get available cpus > [Metropolis-01:24563] hwloc:base:filter_cpus specified - already done > [Metropolis-01:24563] hwloc:base: get available cpus > [Metropolis-01:24563] hwloc:base:filter_cpus specified - already done > [Metropolis-01:24563] hwloc:base: get available cpus > [Metropolis-01:24563] hwloc:base:filter_cpus specified - already done > [Metropolis-01:24563] hwloc:base: get available cpus > [Metropolis-01:24563] hwloc:base:filter_cpus specified - already done > [Metropolis-01:24563] hwloc:base: get available cpus > [Metropolis-01:24563] hwloc:base:filter_cpus specified - already done > [Metropolis-01:24563] hwloc:base: get available cpus > [Metropolis-01:24563] hwloc:base:filter_cpus specified - already done > [Metropolis-01:24563] hwloc:base: get available cpus > [Metropolis-01:24563] hwloc:base:filter_cpus specified - already done > [Metropolis-01:24563] hwloc:base: get available cpus > [Metropolis-01:24563] hwloc:base:filter_cpus specified - already done > [Metropolis-01:24563] hwloc:base:get_nbojbs computed data 8 of Core:0 > [Metropolis-01:24563] hwloc:base: get available cpus > [Metropolis-01:24563] hwloc:base:filter_cpus specified - already done > [Metropolis-01:24563] hwloc:base: get available cpus > [Metropolis-01:24563] hwloc:base:filter_cpus specified - already done > > ======================== JOB MAP ======================== > > Data for node: Metropolis-01 Num procs: 2 > Process OMPI jobid: [36265,1] App: 0 Process rank: 0 > Process OMPI jobid: [36265,1] App: 0 Process rank: 1 > > ============================================================= > [Metropolis-01:24563] [[36265,0],0] grpcomm:bad:xcast sent to job > [36265,0] tag 1 > [Metropolis-01:24563] [[36265,0],0] grpcomm:xcast:recv:send_relay > [Metropolis-01:24563] [[36265,0],0] grpcomm:base:xcast updating daemon > nidmap > [Metropolis-01:24563] [[36265,0],0] orte:daemon:send_relay 
- recipient > list is empty! > [Metropolis-01:24564] mca: base: components_open: Looking for hwloc > components > [Metropolis-01:24564] mca: base: components_open: opening hwloc components > [Metropolis-01:24564] mca: base: components_open: found loaded component > hwloc142 > [Metropolis-01:24564] mca: base: components_open: component hwloc142 has > no register function > [Metropolis-01:24564] mca: base: components_open: component hwloc142 has > no open function > [Metropolis-01:24565] mca: base: components_open: Looking for hwloc > components > [Metropolis-01:24565] mca: base: components_open: opening hwloc components > [Metropolis-01:24565] mca: base: components_open: found loaded component > hwloc142 > [Metropolis-01:24565] mca: base: components_open: component hwloc142 has > no register function > [Metropolis-01:24565] mca: base: components_open: component hwloc142 has > no open function > [Metropolis-01:24564] mca:base:select:(grpcomm) Querying component [bad] > [Metropolis-01:24564] mca:base:select:(grpcomm) Query of component [bad] > set priority to 10 > [Metropolis-01:24564] mca:base:select:(grpcomm) Selected component [bad] > [Metropolis-01:24564] [[36265,1],0] grpcomm:base:receive start comm > [Metropolis-01:24564] computing locality - getting object at level CORE, > index 0 > [Metropolis-01:24564] hwloc:base: get available cpus > [Metropolis-01:24564] hwloc:base:get_available_cpus first time - filtering > cpus > [Metropolis-01:24564] hwloc:base: no cpus specified - using root available > cpuset > [Metropolis-01:24564] computing locality - getting object at level CORE, > index 1 > [Metropolis-01:24564] hwloc:base: get available cpus > [Metropolis-01:24564] hwloc:base:filter_cpus specified - already done > [Metropolis-01:24564] computing locality - shifting up from L1CACHE > [Metropolis-01:24564] computing locality - shifting up from L2CACHE > [Metropolis-01:24564] computing locality - shifting up from L3CACHE > [Metropolis-01:24564] computing locality - 
filling level SOCKET > [Metropolis-01:24564] computing locality - filling level NUMA > [Metropolis-01:24564] locality: CL:CU:N:B:Nu:S > [Metropolis-01:24565] mca:base:select:(grpcomm) Querying component [bad] > [Metropolis-01:24565] mca:base:select:(grpcomm) Query of component [bad] > set priority to 10 > [Metropolis-01:24565] mca:base:select:(grpcomm) Selected component [bad] > [Metropolis-01:24565] [[36265,1],1] grpcomm:base:receive start comm > [Metropolis-01:24564] mca: base: components_open: Looking for coll > components > [Metropolis-01:24564] mca: base: components_open: opening coll components > [Metropolis-01:24564] mca: base: components_open: found loaded component > tuned > [Metropolis-01:24564] mca: base: components_open: component tuned has no > register function > [Metropolis-01:24564] coll:tuned:component_open: done! > [Metropolis-01:24564] mca: base: components_open: component tuned open > function successful > [Metropolis-01:24564] mca: base: components_open: found loaded component sm > [Metropolis-01:24564] mca: base: components_open: component sm register > function successful > [Metropolis-01:24564] mca: base: components_open: component sm has no open > function > [Metropolis-01:24564] mca: base: components_open: found loaded component > libnbc > [Metropolis-01:24564] mca: base: components_open: component libnbc > register function successful > [Metropolis-01:24564] mca: base: components_open: component libnbc open > function successful > [Metropolis-01:24564] mca: base: components_open: found loaded component > hierarch > [Metropolis-01:24564] mca: base: components_open: component hierarch has > no register function > [Metropolis-01:24564] mca: base: components_open: component hierarch open > function successful > [Metropolis-01:24564] mca: base: components_open: found loaded component > basic > [Metropolis-01:24564] mca: base: components_open: component basic register > function successful > [Metropolis-01:24564] mca: base: components_open: 
component basic has no > open function > [Metropolis-01:24564] mca: base: components_open: found loaded component > inter > [Metropolis-01:24564] mca: base: components_open: component inter has no > register function > [Metropolis-01:24564] mca: base: components_open: component inter open > function successful > [Metropolis-01:24564] mca: base: components_open: found loaded component > self > [Metropolis-01:24564] mca: base: components_open: component self has no > register function > [Metropolis-01:24564] mca: base: components_open: component self open > function successful > [Metropolis-01:24565] computing locality - getting object at level CORE, > index 1 > [Metropolis-01:24565] hwloc:base: get available cpus > [Metropolis-01:24565] hwloc:base:get_available_cpus first time - filtering > cpus > [Metropolis-01:24565] hwloc:base: no cpus specified - using root available > cpuset > [Metropolis-01:24565] hwloc:base: get available cpus > [Metropolis-01:24565] hwloc:base:filter_cpus specified - already done > [Metropolis-01:24565] computing locality - getting object at level CORE, > index 0 > [Metropolis-01:24565] computing locality - shifting up from L1CACHE > [Metropolis-01:24565] computing locality - shifting up from L2CACHE > [Metropolis-01:24565] computing locality - shifting up from L3CACHE > [Metropolis-01:24565] computing locality - filling level SOCKET > [Metropolis-01:24565] computing locality - filling level NUMA > [Metropolis-01:24565] locality: CL:CU:N:B:Nu:S > [Metropolis-01:24563] [[36265,0],0] COLLECTIVE RECVD FROM [[36265,1],0] > [Metropolis-01:24563] [[36265,0],0] WORKING COLLECTIVE 0 > [Metropolis-01:24563] [[36265,0],0] ADDING [[36265,1],WILDCARD] TO > PARTICIPANTS > [Metropolis-01:24563] [[36265,0],0] PROGRESSING COLLECTIVE 0 > [Metropolis-01:24563] [[36265,0],0] PROGRESSING COLL id 0 > [Metropolis-01:24563] [[36265,0],0] ALL LOCAL PROCS CONTRIBUTE 2 > [Metropolis-01:24564] [[36265,1],0] grpcomm:base:modex: performing modex > [Metropolis-01:24564] 
[[36265,1],0] grpcomm:base:pack_modex: reporting 4 > entries > [Metropolis-01:24564] [[36265,1],0] grpcomm:base:full:modex: executing > allgather > [Metropolis-01:24564] [[36265,1],0] grpcomm:bad entering allgather > [Metropolis-01:24564] [[36265,1],0] grpcomm:bad allgather underway > [Metropolis-01:24564] [[36265,1],0] grpcomm:base:modex: modex posted > [Metropolis-01:24565] mca: base: components_open: Looking for coll > components > [Metropolis-01:24565] mca: base: components_open: opening coll components > [Metropolis-01:24565] mca: base: components_open: found loaded component > tuned > [Metropolis-01:24565] mca: base: components_open: component tuned has no > register function > [Metropolis-01:24565] coll:tuned:component_open: done! > [Metropolis-01:24565] mca: base: components_open: component tuned open > function successful > [Metropolis-01:24565] mca: base: components_open: found loaded component sm > [Metropolis-01:24565] mca: base: components_open: component sm register > function successful > [Metropolis-01:24565] mca: base: components_open: component sm has no open > function > [Metropolis-01:24565] mca: base: components_open: found loaded component > libnbc > [Metropolis-01:24565] mca: base: components_open: component libnbc > register function successful > [Metropolis-01:24565] mca: base: components_open: component libnbc open > function successful > [Metropolis-01:24565] mca: base: components_open: found loaded component > hierarch > [Metropolis-01:24565] mca: base: components_open: component hierarch has > no register function > [Metropolis-01:24565] mca: base: components_open: component hierarch open > function successful > [Metropolis-01:24565] mca: base: components_open: found loaded component > basic > [Metropolis-01:24565] mca: base: components_open: component basic register > function successful > [Metropolis-01:24565] mca: base: components_open: component basic has no > open function > [Metropolis-01:24565] mca: base: components_open: found 
loaded component > inter > [Metropolis-01:24565] mca: base: components_open: component inter has no > register function > [Metropolis-01:24565] mca: base: components_open: component inter open > function successful > [Metropolis-01:24565] mca: base: components_open: found loaded component > self > [Metropolis-01:24565] mca: base: components_open: component self has no > register function > [Metropolis-01:24565] mca: base: components_open: component self open > function successful > [Metropolis-01:24563] [[36265,0],0] COLLECTIVE RECVD FROM [[36265,1],1] > [Metropolis-01:24563] [[36265,0],0] WORKING COLLECTIVE 0 > [Metropolis-01:24563] [[36265,0],0] PROGRESSING COLLECTIVE 0 > [Metropolis-01:24563] [[36265,0],0] PROGRESSING COLL id 0 > [Metropolis-01:24563] [[36265,0],0] ALL LOCAL PROCS CONTRIBUTE 2 > [Metropolis-01:24563] [[36265,0],0] COLLECTIVE 0 LOCALLY COMPLETE - > SENDING TO GLOBAL COLLECTIVE > [Metropolis-01:24563] [[36265,0],0] grpcomm:base:daemon_coll: daemon > collective recvd from [[36265,0],0] > [Metropolis-01:24563] [[36265,0],0] grpcomm:base:daemon_coll: WORKING > COLLECTIVE 0 > [Metropolis-01:24563] [[36265,0],0] grpcomm:base:daemon_coll: NUM > CONTRIBS: 2 > [Metropolis-01:24563] [[36265,0],0] grpcomm:bad:xcast sent to job > [36265,1] tag 30 > [Metropolis-01:24563] [[36265,0],0] grpcomm:xcast:recv:send_relay > [Metropolis-01:24563] [[36265,0],0] orte:daemon:send_relay - recipient > list is empty! 
> [Metropolis-01:24565] [[36265,1],1] grpcomm:base:modex: performing modex > [Metropolis-01:24565] [[36265,1],1] grpcomm:base:pack_modex: reporting 4 > entries > [Metropolis-01:24565] [[36265,1],1] grpcomm:base:full:modex: executing > allgather > [Metropolis-01:24565] [[36265,1],1] grpcomm:bad entering allgather > [Metropolis-01:24565] [[36265,1],1] grpcomm:bad allgather underway > [Metropolis-01:24565] [[36265,1],1] grpcomm:base:modex: modex posted > [Metropolis-01:24564] [[36265,1],0] grpcomm:base:receive processing > collective return for id 0 > [Metropolis-01:24564] [[36265,1],0] CHECKING COLL id 0 > [Metropolis-01:24564] [[36265,1],0] STORING MODEX DATA > [Metropolis-01:24564] [[36265,1],0] grpcomm:base:store_modex adding modex > entry for proc [[36265,1],0] > [Metropolis-01:24565] [[36265,1],1] grpcomm:base:receive processing > collective return for id 0 > [Metropolis-01:24565] [[36265,1],1] CHECKING COLL id 0 > [Metropolis-01:24565] [[36265,1],1] STORING MODEX DATA > [Metropolis-01:24565] [[36265,1],1] grpcomm:base:store_modex adding modex > entry for proc [[36265,1],0] > [Metropolis-01:24564] [[36265,1],0] grpcomm:base:update_modex_entries: > adding 4 entries for proc [[36265,1],0] > [Metropolis-01:24564] [[36265,1],0] grpcomm:base:store_modex adding modex > entry for proc [[36265,1],1] > [Metropolis-01:24564] [[36265,1],0] grpcomm:base:update_modex_entries: > adding 4 entries for proc [[36265,1],1] > [Metropolis-01:24565] [[36265,1],1] grpcomm:base:update_modex_entries: > adding 4 entries for proc [[36265,1],0] > [Metropolis-01:24565] [[36265,1],1] grpcomm:base:store_modex adding modex > entry for proc [[36265,1],1] > [Metropolis-01:24565] [[36265,1],1] grpcomm:base:update_modex_entries: > adding 4 entries for proc [[36265,1],1] > [Metropolis-01:24564] coll:find_available: querying coll component tuned > [Metropolis-01:24564] coll:find_available: coll component tuned is > available > [Metropolis-01:24565] coll:find_available: querying coll component tuned 
> [Metropolis-01:24565] coll:find_available: coll component tuned is > available > [Metropolis-01:24565] coll:find_available: querying coll component sm > [Metropolis-01:24564] coll:find_available: querying coll component sm > [Metropolis-01:24564] coll:sm:init_query: no other local procs; > disqualifying myself > [Metropolis-01:24564] coll:find_available: coll component sm is not > available > [Metropolis-01:24564] coll:find_available: querying coll component libnbc > [Metropolis-01:24564] coll:find_available: coll component libnbc is > available > [Metropolis-01:24564] coll:find_available: querying coll component hierarch > [Metropolis-01:24564] coll:find_available: coll component hierarch is > available > [Metropolis-01:24564] coll:find_available: querying coll component basic > [Metropolis-01:24564] coll:find_available: coll component basic is > available > [Metropolis-01:24565] coll:sm:init_query: no other local procs; > disqualifying myself > [Metropolis-01:24565] coll:find_available: coll component sm is not > available > [Metropolis-01:24565] coll:find_available: querying coll component libnbc > [Metropolis-01:24565] coll:find_available: coll component libnbc is > available > [Metropolis-01:24565] coll:find_available: querying coll component hierarch > [Metropolis-01:24565] coll:find_available: coll component hierarch is > available > [Metropolis-01:24565] coll:find_available: querying coll component basic > [Metropolis-01:24565] coll:find_available: coll component basic is > available > [Metropolis-01:24564] coll:find_available: querying coll component inter > [Metropolis-01:24564] coll:find_available: coll component inter is > available > [Metropolis-01:24564] coll:find_available: querying coll component self > [Metropolis-01:24564] coll:find_available: coll component self is available > [Metropolis-01:24565] coll:find_available: querying coll component inter > [Metropolis-01:24565] coll:find_available: coll component inter is > available > 
[Metropolis-01:24565] coll:find_available: querying coll component self > [Metropolis-01:24565] coll:find_available: coll component self is available > [Metropolis-01:24565] hwloc:base:get_nbojbs computed data 0 of NUMANode:0 > [Metropolis-01:24564] hwloc:base:get_nbojbs computed data 0 of NUMANode:0 > [Metropolis-01:24563] [[36265,0],0] COLLECTIVE RECVD FROM [[36265,1],1] > [Metropolis-01:24563] [[36265,0],0] WORKING COLLECTIVE 1 > [Metropolis-01:24563] [[36265,0],0] ADDING [[36265,1],WILDCARD] TO > PARTICIPANTS > [Metropolis-01:24563] [[36265,0],0] PROGRESSING COLLECTIVE 1 > [Metropolis-01:24563] [[36265,0],0] PROGRESSING COLL id 1 > [Metropolis-01:24563] [[36265,0],0] ALL LOCAL PROCS CONTRIBUTE 2 > [Metropolis-01:24563] [[36265,0],0] COLLECTIVE RECVD FROM [[36265,1],0] > [Metropolis-01:24563] [[36265,0],0] WORKING COLLECTIVE 1 > [Metropolis-01:24563] [[36265,0],0] PROGRESSING COLLECTIVE 1 > [Metropolis-01:24563] [[36265,0],0] PROGRESSING COLL id 1 > [Metropolis-01:24563] [[36265,0],0] ALL LOCAL PROCS CONTRIBUTE 2 > [Metropolis-01:24563] [[36265,0],0] COLLECTIVE 1 LOCALLY COMPLETE - > SENDING TO GLOBAL COLLECTIVE > [Metropolis-01:24563] [[36265,0],0] grpcomm:base:daemon_coll: daemon > collective recvd from [[36265,0],0] > [Metropolis-01:24563] [[36265,0],0] grpcomm:base:daemon_coll: WORKING > COLLECTIVE 1 > [Metropolis-01:24563] [[36265,0],0] grpcomm:base:daemon_coll: NUM > CONTRIBS: 2 > [Metropolis-01:24563] [[36265,0],0] grpcomm:bad:xcast sent to job > [36265,1] tag 30 > [Metropolis-01:24563] [[36265,0],0] grpcomm:xcast:recv:send_relay > [Metropolis-01:24563] [[36265,0],0] orte:daemon:send_relay - recipient > list is empty! 
> [Metropolis-01:24565] [[36265,1],1] grpcomm:bad entering barrier > [Metropolis-01:24565] [[36265,1],1] grpcomm:bad barrier underway > [Metropolis-01:24564] [[36265,1],0] grpcomm:bad entering barrier > [Metropolis-01:24564] [[36265,1],0] grpcomm:bad barrier underway > [Metropolis-01:24564] [[36265,1],0] grpcomm:base:receive processing > collective return for id 1 > [Metropolis-01:24564] [[36265,1],0] CHECKING COLL id 1 > [Metropolis-01:24565] [[36265,1],1] grpcomm:base:receive processing > collective return for id 1 > [Metropolis-01:24565] [[36265,1],1] CHECKING COLL id 1 > [Metropolis-01:24565] coll:base:comm_select: new communicator: > MPI_COMM_WORLD (cid 0) > [Metropolis-01:24565] coll:base:comm_select: Checking all available modules > [Metropolis-01:24565] coll:tuned:module_tuned query called > [Metropolis-01:24565] coll:base:comm_select: component available: tuned, > priority: 30 > [Metropolis-01:24565] coll:base:comm_select: component available: libnbc, > priority: 10 > [Metropolis-01:24565] coll:base:comm_select: component not available: > hierarch > [Metropolis-01:24565] coll:base:comm_select: component available: basic, > priority: 10 > [Metropolis-01:24565] coll:base:comm_select: component not available: inter > [Metropolis-01:24565] coll:base:comm_select: component not available: self > [Metropolis-01:24565] coll:tuned:module_init called. 
> [Metropolis-01:24565] coll:tuned:module_init Tuned is in use > [Metropolis-01:24565] coll:base:comm_select: new communicator: > MPI_COMM_SELF (cid 1) > [Metropolis-01:24565] coll:base:comm_select: Checking all available modules > [Metropolis-01:24564] coll:base:comm_select: new communicator: > MPI_COMM_WORLD (cid 0) > [Metropolis-01:24564] coll:base:comm_select: Checking all available modules > [Metropolis-01:24564] coll:tuned:module_tuned query called > [Metropolis-01:24564] coll:base:comm_select: component available: tuned, > priority: 30 > [Metropolis-01:24564] coll:base:comm_select: component available: libnbc, > priority: 10 > [Metropolis-01:24564] coll:base:comm_select: component not available: > hierarch > [Metropolis-01:24564] coll:base:comm_select: component available: basic, > priority: 10 > [Metropolis-01:24564] coll:base:comm_select: component not available: inter > [Metropolis-01:24564] coll:base:comm_select: component not available: self > [Metropolis-01:24564] coll:tuned:module_init called. 
> [Metropolis-01:24565] coll:tuned:module_tuned query called > [Metropolis-01:24565] coll:base:comm_select: component not available: tuned > [Metropolis-01:24565] coll:base:comm_select: component available: libnbc, > priority: 10 > [Metropolis-01:24565] coll:base:comm_select: component not available: > hierarch > [Metropolis-01:24565] coll:base:comm_select: component available: basic, > priority: 10 > [Metropolis-01:24565] coll:base:comm_select: component not available: inter > [Metropolis-01:24565] coll:base:comm_select: component available: self, > priority: 75 > [Metropolis-01:24564] coll:tuned:module_init Tuned is in use > [Metropolis-01:24564] coll:base:comm_select: new communicator: > MPI_COMM_SELF (cid 1) > [Metropolis-01:24564] coll:base:comm_select: Checking all available modules > [Metropolis-01:24564] coll:tuned:module_tuned query called > [Metropolis-01:24564] coll:base:comm_select: component not available: tuned > [Metropolis-01:24564] coll:base:comm_select: component available: libnbc, > priority: 10 > [Metropolis-01:24564] coll:base:comm_select: component not available: > hierarch > [Metropolis-01:24564] coll:base:comm_select: component available: basic, > priority: 10 > [Metropolis-01:24564] coll:base:comm_select: component not available: inter > [Metropolis-01:24564] coll:base:comm_select: component available: self, > priority: 75 > [Metropolis-01:24565] [[36265,1],1] grpcomm:bad entering barrier > [Metropolis-01:24563] [[36265,0],0] COLLECTIVE RECVD FROM [[36265,1],1] > [Metropolis-01:24563] [[36265,0],0] WORKING COLLECTIVE 2 > [Metropolis-01:24563] [[36265,0],0] ADDING [[36265,1],WILDCARD] TO > PARTICIPANTS > [Metropolis-01:24563] [[36265,0],0] PROGRESSING COLLECTIVE 2 > [Metropolis-01:24563] [[36265,0],0] PROGRESSING COLL id 2 > [Metropolis-01:24563] [[36265,0],0] ALL LOCAL PROCS CONTRIBUTE 2 > [Metropolis-01:24563] [[36265,0],0] COLLECTIVE RECVD FROM [[36265,1],0] > [Metropolis-01:24563] [[36265,0],0] WORKING COLLECTIVE 2 > 
[Metropolis-01:24563] [[36265,0],0] PROGRESSING COLLECTIVE 2 > [Metropolis-01:24563] [[36265,0],0] PROGRESSING COLL id 2 > [Metropolis-01:24563] [[36265,0],0] ALL LOCAL PROCS CONTRIBUTE 2 > [Metropolis-01:24563] [[36265,0],0] COLLECTIVE 2 LOCALLY COMPLETE - > SENDING TO GLOBAL COLLECTIVE > [Metropolis-01:24563] [[36265,0],0] grpcomm:base:daemon_coll: daemon > collective recvd from [[36265,0],0] > [Metropolis-01:24563] [[36265,0],0] grpcomm:base:daemon_coll: WORKING > COLLECTIVE 2 > [Metropolis-01:24563] [[36265,0],0] grpcomm:base:daemon_coll: NUM > CONTRIBS: 2 > [Metropolis-01:24563] [[36265,0],0] grpcomm:bad:xcast sent to job > [36265,1] tag 30 > [Metropolis-01:24563] [[36265,0],0] grpcomm:xcast:recv:send_relay > [Metropolis-01:24563] [[36265,0],0] orte:daemon:send_relay - recipient > list is empty! > [Metropolis-01:24564] [[36265,1],0] grpcomm:bad entering barrier > [Metropolis-01:24564] [[36265,1],0] grpcomm:bad barrier underway > [Metropolis-01:24564] [[36265,1],0] grpcomm:base:receive processing > collective return for id 2 > [Metropolis-01:24564] [[36265,1],0] CHECKING COLL id 2 > [Metropolis-01:24565] [[36265,1],1] grpcomm:bad barrier underway > [Metropolis-01:24565] [[36265,1],1] grpcomm:base:receive processing > collective return for id 2 > [Metropolis-01:24565] [[36265,1],1] CHECKING COLL id 2 > [Metropolis-01:24565] coll:tuned:component_close: called > [Metropolis-01:24565] coll:tuned:component_close: done! 
> [Metropolis-01:24565] mca: base: close: component tuned closed
> [Metropolis-01:24565] mca: base: close: unloading component tuned
> [Metropolis-01:24565] mca: base: close: component libnbc closed
> [Metropolis-01:24565] mca: base: close: unloading component libnbc
> [Metropolis-01:24565] mca: base: close: unloading component hierarch
> [Metropolis-01:24565] mca: base: close: unloading component basic
> [Metropolis-01:24565] mca: base: close: unloading component inter
> [Metropolis-01:24565] mca: base: close: unloading component self
> [Metropolis-01:24565] [[36265,1],1] grpcomm:base:receive stop comm
> [Metropolis-01:24564] coll:tuned:component_close: called
> [Metropolis-01:24564] coll:tuned:component_close: done!
> [Metropolis-01:24564] mca: base: close: component tuned closed
> [Metropolis-01:24564] mca: base: close: unloading component tuned
> [Metropolis-01:24564] mca: base: close: component libnbc closed
> [Metropolis-01:24564] mca: base: close: unloading component libnbc
> [Metropolis-01:24564] mca: base: close: unloading component hierarch
> [Metropolis-01:24564] mca: base: close: unloading component basic
> [Metropolis-01:24564] mca: base: close: unloading component inter
> [Metropolis-01:24564] mca: base: close: unloading component self
> [Metropolis-01:24564] [[36265,1],0] grpcomm:base:receive stop comm
> [Metropolis-01:24563] [[36265,0],0] grpcomm:bad:xcast sent to job
> [36265,0] tag 1
> [Metropolis-01:24563] [[36265,0],0] grpcomm:xcast:recv:send_relay
> [Metropolis-01:24563] [[36265,0],0] orte:daemon:send_relay - recipient
> list is empty!
> [jarico@Metropolis-01 examples]$
>
> On 03/07/2012, at 21:44, Ralph Castain wrote:
>
> > Interesting - yes, coll sm doesn't think they are on the same node for
> > some reason. Try adding -mca grpcomm_base_verbose 5 and let's see why
> >
> > On Jul 3, 2012, at 1:24 PM, Juan Antonio Rico Gallego wrote:
> >
> >> The code I run is a simple broadcast.
> >> > >> When I do not specify components to run, the output is (more verbose): > >> > >> [jarico@Metropolis-01 examples]$ > /home/jarico/shared/packages/openmpi-cas-dbg/bin/mpiexec --mca > mca_base_verbose 100 --mca mca_coll_base_output 100 --mca coll_sm_priority > 99 -mca hwloc_base_verbose 90 --display-map --mca mca_verbose 100 --mca > mca_base_verbose 100 --mca coll_base_verbose 100 -n 2 ./bmem > >> [Metropolis-01:24490] mca: base: components_open: Looking for hwloc > components > >> [Metropolis-01:24490] mca: base: components_open: opening hwloc > components > >> [Metropolis-01:24490] mca: base: components_open: found loaded > component hwloc142 > >> [Metropolis-01:24490] mca: base: components_open: component hwloc142 > has no register function > >> [Metropolis-01:24490] mca: base: components_open: component hwloc142 > has no open function > >> [Metropolis-01:24490] hwloc:base:get_topology > >> [Metropolis-01:24490] hwloc:base: no cpus specified - using root > available cpuset > >> > >> ======================== JOB MAP ======================== > >> > >> Data for node: Metropolis-01 Num procs: 2 > >> Process OMPI jobid: [36336,1] App: 0 Process rank: 0 > >> Process OMPI jobid: [36336,1] App: 0 Process rank: 1 > >> > >> ============================================================= > >> [Metropolis-01:24491] mca: base: components_open: Looking for hwloc > components > >> [Metropolis-01:24491] mca: base: components_open: opening hwloc > components > >> [Metropolis-01:24491] mca: base: components_open: found loaded > component hwloc142 > >> [Metropolis-01:24491] mca: base: components_open: component hwloc142 > has no register function > >> [Metropolis-01:24491] mca: base: components_open: component hwloc142 > has no open function > >> [Metropolis-01:24492] mca: base: components_open: Looking for hwloc > components > >> [Metropolis-01:24492] mca: base: components_open: opening hwloc > components > >> [Metropolis-01:24492] mca: base: components_open: found loaded > 
component hwloc142 > >> [Metropolis-01:24492] mca: base: components_open: component hwloc142 > has no register function > >> [Metropolis-01:24492] mca: base: components_open: component hwloc142 > has no open function > >> [Metropolis-01:24491] locality: CL:CU:N:B > >> [Metropolis-01:24491] hwloc:base: get available cpus > >> [Metropolis-01:24491] hwloc:base:get_available_cpus first time - > filtering cpus > >> [Metropolis-01:24491] hwloc:base: no cpus specified - using root > available cpuset > >> [Metropolis-01:24491] hwloc:base:get_available_cpus root object > >> [Metropolis-01:24491] mca: base: components_open: Looking for coll > components > >> [Metropolis-01:24491] mca: base: components_open: opening coll > components > >> [Metropolis-01:24491] mca: base: components_open: found loaded > component tuned > >> [Metropolis-01:24491] mca: base: components_open: component tuned has > no register function > >> [Metropolis-01:24491] coll:tuned:component_open: done! > >> [Metropolis-01:24491] mca: base: components_open: component tuned open > function successful > >> [Metropolis-01:24491] mca: base: components_open: found loaded > component sm > >> [Metropolis-01:24491] mca: base: components_open: component sm register > function successful > >> [Metropolis-01:24491] mca: base: components_open: component sm has no > open function > >> [Metropolis-01:24491] mca: base: components_open: found loaded > component libnbc > >> [Metropolis-01:24491] mca: base: components_open: component libnbc > register function successful > >> [Metropolis-01:24491] mca: base: components_open: component libnbc open > function successful > >> [Metropolis-01:24491] mca: base: components_open: found loaded > component hierarch > >> [Metropolis-01:24491] mca: base: components_open: component hierarch > has no register function > >> [Metropolis-01:24491] mca: base: components_open: component hierarch > open function successful > >> [Metropolis-01:24491] mca: base: components_open: found loaded > 
component basic > >> [Metropolis-01:24491] mca: base: components_open: component basic > register function successful > >> [Metropolis-01:24491] mca: base: components_open: component basic has > no open function > >> [Metropolis-01:24491] mca: base: components_open: found loaded > component inter > >> [Metropolis-01:24491] mca: base: components_open: component inter has > no register function > >> [Metropolis-01:24491] mca: base: components_open: component inter open > function successful > >> [Metropolis-01:24491] mca: base: components_open: found loaded > component self > >> [Metropolis-01:24491] mca: base: components_open: component self has no > register function > >> [Metropolis-01:24491] mca: base: components_open: component self open > function successful > >> [Metropolis-01:24492] locality: CL:CU:N:B > >> [Metropolis-01:24492] hwloc:base: get available cpus > >> [Metropolis-01:24492] hwloc:base:get_available_cpus first time - > filtering cpus > >> [Metropolis-01:24492] hwloc:base: no cpus specified - using root > available cpuset > >> [Metropolis-01:24492] hwloc:base:get_available_cpus root object > >> [Metropolis-01:24492] mca: base: components_open: Looking for coll > components > >> [Metropolis-01:24492] mca: base: components_open: opening coll > components > >> [Metropolis-01:24492] mca: base: components_open: found loaded > component tuned > >> [Metropolis-01:24492] mca: base: components_open: component tuned has > no register function > >> [Metropolis-01:24492] coll:tuned:component_open: done! 
> >> [Metropolis-01:24492] mca: base: components_open: component tuned open > function successful > >> [Metropolis-01:24492] mca: base: components_open: found loaded > component sm > >> [Metropolis-01:24492] mca: base: components_open: component sm register > function successful > >> [Metropolis-01:24492] mca: base: components_open: component sm has no > open function > >> [Metropolis-01:24492] mca: base: components_open: found loaded > component libnbc > >> [Metropolis-01:24492] mca: base: components_open: component libnbc > register function successful > >> [Metropolis-01:24492] mca: base: components_open: component libnbc open > function successful > >> [Metropolis-01:24492] mca: base: components_open: found loaded > component hierarch > >> [Metropolis-01:24492] mca: base: components_open: component hierarch > has no register function > >> [Metropolis-01:24492] mca: base: components_open: component hierarch > open function successful > >> [Metropolis-01:24492] mca: base: components_open: found loaded > component basic > >> [Metropolis-01:24492] mca: base: components_open: component basic > register function successful > >> [Metropolis-01:24492] mca: base: components_open: component basic has > no open function > >> [Metropolis-01:24492] mca: base: components_open: found loaded > component inter > >> [Metropolis-01:24492] mca: base: components_open: component inter has > no register function > >> [Metropolis-01:24492] mca: base: components_open: component inter open > function successful > >> [Metropolis-01:24492] mca: base: components_open: found loaded > component self > >> [Metropolis-01:24492] mca: base: components_open: component self has no > register function > >> [Metropolis-01:24492] mca: base: components_open: component self open > function successful > >> [Metropolis-01:24491] coll:find_available: querying coll component tuned > >> [Metropolis-01:24491] coll:find_available: coll component tuned is > available > >> [Metropolis-01:24491] 
coll:find_available: querying coll component sm > >> [Metropolis-01:24491] coll:sm:init_query: no other local procs; > disqualifying myself > >> [Metropolis-01:24491] coll:find_available: coll component sm is not > available > >> [Metropolis-01:24491] coll:find_available: querying coll component > libnbc > >> [Metropolis-01:24491] coll:find_available: coll component libnbc is > available > >> [Metropolis-01:24491] coll:find_available: querying coll component > hierarch > >> [Metropolis-01:24491] coll:find_available: coll component hierarch is > available > >> [Metropolis-01:24491] coll:find_available: querying coll component basic > >> [Metropolis-01:24491] coll:find_available: coll component basic is > available > >> [Metropolis-01:24491] coll:find_available: querying coll component inter > >> [Metropolis-01:24492] coll:find_available: querying coll component tuned > >> [Metropolis-01:24492] coll:find_available: coll component tuned is > available > >> [Metropolis-01:24492] coll:find_available: querying coll component sm > >> [Metropolis-01:24492] coll:sm:init_query: no other local procs; > disqualifying myself > >> [Metropolis-01:24492] coll:find_available: coll component sm is not > available > >> [Metropolis-01:24492] coll:find_available: querying coll component > libnbc > >> [Metropolis-01:24492] coll:find_available: coll component libnbc is > available > >> [Metropolis-01:24492] coll:find_available: querying coll component > hierarch > >> [Metropolis-01:24492] coll:find_available: coll component hierarch is > available > >> [Metropolis-01:24492] coll:find_available: querying coll component basic > >> [Metropolis-01:24492] coll:find_available: coll component basic is > available > >> [Metropolis-01:24492] coll:find_available: querying coll component inter > >> [Metropolis-01:24492] coll:find_available: coll component inter is > available > >> [Metropolis-01:24492] coll:find_available: querying coll component self > >> [Metropolis-01:24492] coll:find_available: 
coll component self is > available > >> [Metropolis-01:24491] coll:find_available: coll component inter is > available > >> [Metropolis-01:24491] coll:find_available: querying coll component self > >> [Metropolis-01:24491] coll:find_available: coll component self is > available > >> [Metropolis-01:24492] hwloc:base:get_nbojbs computed data 0 of > NUMANode:0 > >> [Metropolis-01:24491] hwloc:base:get_nbojbs computed data 0 of > NUMANode:0 > >> [Metropolis-01:24491] coll:base:comm_select: new communicator: > MPI_COMM_WORLD (cid 0) > >> [Metropolis-01:24491] coll:base:comm_select: Checking all available > modules > >> [Metropolis-01:24491] coll:tuned:module_tuned query called > >> [Metropolis-01:24491] coll:base:comm_select: component available: > tuned, priority: 30 > >> [Metropolis-01:24491] coll:base:comm_select: component available: > libnbc, priority: 10 > >> [Metropolis-01:24491] coll:base:comm_select: component not available: > hierarch > >> [Metropolis-01:24491] coll:base:comm_select: component available: > basic, priority: 10 > >> [Metropolis-01:24491] coll:base:comm_select: component not available: > inter > >> [Metropolis-01:24491] coll:base:comm_select: component not available: > self > >> [Metropolis-01:24491] coll:tuned:module_init called. 
> >> [Metropolis-01:24491] coll:tuned:module_init Tuned is in use > >> [Metropolis-01:24491] coll:base:comm_select: new communicator: > MPI_COMM_SELF (cid 1) > >> [Metropolis-01:24491] coll:base:comm_select: Checking all available > modules > >> [Metropolis-01:24491] coll:tuned:module_tuned query called > >> [Metropolis-01:24491] coll:base:comm_select: component not available: > tuned > >> [Metropolis-01:24491] coll:base:comm_select: component available: > libnbc, priority: 10 > >> [Metropolis-01:24491] coll:base:comm_select: component not available: > hierarch > >> [Metropolis-01:24491] coll:base:comm_select: component available: > basic, priority: 10 > >> [Metropolis-01:24491] coll:base:comm_select: component not available: > inter > >> [Metropolis-01:24491] coll:base:comm_select: component available: self, > priority: 75 > >> [Metropolis-01:24492] coll:base:comm_select: new communicator: > MPI_COMM_WORLD (cid 0) > >> [Metropolis-01:24492] coll:base:comm_select: Checking all available > modules > >> [Metropolis-01:24492] coll:tuned:module_tuned query called > >> [Metropolis-01:24492] coll:base:comm_select: component available: > tuned, priority: 30 > >> [Metropolis-01:24492] coll:base:comm_select: component available: > libnbc, priority: 10 > >> [Metropolis-01:24492] coll:base:comm_select: component not available: > hierarch > >> [Metropolis-01:24492] coll:base:comm_select: component available: > basic, priority: 10 > >> [Metropolis-01:24492] coll:base:comm_select: component not available: > inter > >> [Metropolis-01:24492] coll:base:comm_select: component not available: > self > >> [Metropolis-01:24492] coll:tuned:module_init called. 
> >> [Metropolis-01:24492] coll:tuned:module_init Tuned is in use > >> [Metropolis-01:24492] coll:base:comm_select: new communicator: > MPI_COMM_SELF (cid 1) > >> [Metropolis-01:24492] coll:base:comm_select: Checking all available > modules > >> [Metropolis-01:24492] coll:tuned:module_tuned query called > >> [Metropolis-01:24492] coll:base:comm_select: component not available: > tuned > >> [Metropolis-01:24492] coll:base:comm_select: component available: > libnbc, priority: 10 > >> [Metropolis-01:24492] coll:base:comm_select: component not available: > hierarch > >> [Metropolis-01:24492] coll:base:comm_select: component available: > basic, priority: 10 > >> [Metropolis-01:24492] coll:base:comm_select: component not available: > inter > >> [Metropolis-01:24492] coll:base:comm_select: component available: self, > priority: 75 > >> [Metropolis-01:24491] coll:tuned:component_close: called > >> [Metropolis-01:24491] coll:tuned:component_close: done! > >> [Metropolis-01:24492] coll:tuned:component_close: called > >> [Metropolis-01:24492] coll:tuned:component_close: done! 
> >> [Metropolis-01:24492] mca: base: close: component tuned closed > >> [Metropolis-01:24492] mca: base: close: unloading component tuned > >> [Metropolis-01:24492] mca: base: close: component libnbc closed > >> [Metropolis-01:24492] mca: base: close: unloading component libnbc > >> [Metropolis-01:24492] mca: base: close: unloading component hierarch > >> [Metropolis-01:24492] mca: base: close: unloading component basic > >> [Metropolis-01:24492] mca: base: close: unloading component inter > >> [Metropolis-01:24492] mca: base: close: unloading component self > >> [Metropolis-01:24491] mca: base: close: component tuned closed > >> [Metropolis-01:24491] mca: base: close: unloading component tuned > >> [Metropolis-01:24491] mca: base: close: component libnbc closed > >> [Metropolis-01:24491] mca: base: close: unloading component libnbc > >> [Metropolis-01:24491] mca: base: close: unloading component hierarch > >> [Metropolis-01:24491] mca: base: close: unloading component basic > >> [Metropolis-01:24491] mca: base: close: unloading component inter > >> [Metropolis-01:24491] mca: base: close: unloading component self > >> [jarico@Metropolis-01 examples]$ > >> > >> > >> SM is not loaded because it detects no other processes on the same > machine: > >> > >> [Metropolis-01:24491] coll:sm:init_query: no other local procs; > disqualifying myself > >> > >> The machine is a multicore machine with 8 cores. > >> > >> I need to run the SM component code, and I suppose that, with its priority raised, it > will be the selected component once the problem is solved. > >> > >> > >> > >> On 03/07/2012, at 21:01, Jeff Squyres wrote: > >> > >>> The issue is that the "sm" coll component only implements a few of the > MPI collective operations. It is usually mixed at run-time with other coll > components to fill out the rest of the MPI collective operations. 
> >>> > >>> So what is happening is that OMPI is determining that it doesn't have > implementations of all the MPI collective operations and aborting. > >>> > >>> You shouldn't need to manually select your coll module -- OMPI should > automatically select the right collective module for you. E.g., if all > procs are local on a single machine and sm has a matching implementation > for that MPI collective operation, it'll be used. > >>> > >>> > >>> > >>> On Jul 3, 2012, at 2:48 PM, Juan Antonio Rico Gallego wrote: > >>> > >>>> Output is: > >>>> > >>>> [Metropolis-01:15355] hwloc:base:get_topology > >>>> [Metropolis-01:15355] hwloc:base: no cpus specified - using root > available cpuset > >>>> > >>>> ======================== JOB MAP ======================== > >>>> > >>>> Data for node: Metropolis-01 Num procs: 2 > >>>> Process OMPI jobid: [59809,1] App: 0 Process rank: 0 > >>>> Process OMPI jobid: [59809,1] App: 0 Process rank: 1 > >>>> > >>>> ============================================================= > >>>> [Metropolis-01:15356] locality: CL:CU:N:B > >>>> [Metropolis-01:15356] hwloc:base: get available cpus > >>>> [Metropolis-01:15356] hwloc:base:get_available_cpus first time - > filtering cpus > >>>> [Metropolis-01:15356] hwloc:base: no cpus specified - using root > available cpuset > >>>> [Metropolis-01:15356] hwloc:base:get_available_cpus root object > >>>> [Metropolis-01:15357] locality: CL:CU:N:B > >>>> [Metropolis-01:15357] hwloc:base: get available cpus > >>>> [Metropolis-01:15357] hwloc:base:get_available_cpus first time - > filtering cpus > >>>> [Metropolis-01:15357] hwloc:base: no cpus specified - using root > available cpuset > >>>> [Metropolis-01:15357] hwloc:base:get_available_cpus root object > >>>> [Metropolis-01:15356] hwloc:base:get_nbojbs computed data 0 of > NUMANode:0 > >>>> [Metropolis-01:15357] hwloc:base:get_nbojbs computed data 0 of > NUMANode:0 > >>>> > >>>> > >>>> Regards, > >>>> Juan A. 
Rico > >>>> _______________________________________________ > >>>> devel mailing list > >>>> de...@open-mpi.org > >>>> http://www.open-mpi.org/mailman/listinfo.cgi/devel > >>> > >>> > >>> -- > >>> Jeff Squyres > >>> jsquy...@cisco.com > >>> For corporate legal information go to: > http://www.cisco.com/web/about/doing_business/legal/cri/
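For readers following the thread, Jeff's point about "sm" being mixed with other coll components can be illustrated with a small model. This is only a sketch under stated assumptions, not Open MPI's actual selection code: the function name select_collectives, the list of required operations, and the set of operations assigned to each component are all hypothetical. The priorities for tuned (30) and basic (10) are the ones reported in the log, and 99 is the coll_sm_priority value passed on the command line.

```python
# Simplified model of per-operation collective selection.
# NOT Open MPI's real code: names and operation lists are illustrative assumptions.

# Hypothetical subset of the collective operations a communicator needs.
REQUIRED_OPS = {"barrier", "bcast", "reduce", "allreduce", "gather"}

# name -> (priority, operations the component implements).
# Priorities for tuned and basic come from the log; 99 is the value
# given to coll_sm_priority. The operation sets are assumptions.
EXAMPLE_COMPONENTS = {
    "tuned": (30, set(REQUIRED_OPS)),
    "basic": (10, set(REQUIRED_OPS)),
    "sm": (99, {"barrier", "bcast", "reduce", "allreduce"}),
}


def select_collectives(components, required_ops=REQUIRED_OPS):
    """Pick, for each operation, the highest-priority component providing it.

    Raises RuntimeError if some required operation has no provider,
    which models the abort described in the thread.
    """
    table = {}  # op -> (component name, priority)
    for name, (priority, ops) in components.items():
        for op in ops:
            best = table.get(op)
            # Strictly higher priority wins; on a tie the first one seen stays.
            if best is None or priority > best[1]:
                table[op] = (name, priority)

    missing = sorted(set(required_ops) - set(table))
    if missing:
        raise RuntimeError("no component implements: " + ", ".join(missing))
    return {op: name for op, (name, _) in table.items()}


if __name__ == "__main__":
    # With all components available, sm takes the operations it implements
    # and tuned fills in the rest.
    print(select_collectives(EXAMPLE_COMPONENTS))
    # With sm alone, the table cannot be completed and selection fails.
    try:
        select_collectives({"sm": EXAMPLE_COMPONENTS["sm"]})
    except RuntimeError as err:
        print("selection failed:", err)
```

In this model, raising the priority of sm only matters for the operations it provides and only when it has not disqualified itself, which matches the "no other local procs; disqualifying myself" message in the log.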