Rats - no help there. I'll add some debug output to the code base tonight that 
will tell us more about what's going on here.
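
In the meantime, one way to compare the two runs quoted below is to pull out only the per-process locality string and the coll sm disqualification message. This is a minimal Python sketch, not part of Open MPI: the function name `summarize` and the regular expression are assumptions based solely on the `[host:pid] message` line format visible in the quoted output.

```python
import re
from collections import defaultdict

# Assumed line format, taken from the quoted output: "[host:pid] message".
LINE_RE = re.compile(r"^\[(?P<host>[^:\]]+):(?P<pid>\d+)\]\s+(?P<msg>.*)$")


def summarize(log_text):
    """Per pid, record the locality string and whether coll sm disqualified itself."""
    summary = defaultdict(lambda: {"locality": None, "sm_disqualified": False})
    for raw in log_text.splitlines():
        # Drop email quote markers ("> ", ">>> ") before matching.
        line = raw.lstrip("> ").rstrip()
        match = LINE_RE.match(line)
        if not match:
            continue  # wrapped continuation lines, banners, shell prompts
        pid = int(match.group("pid"))
        msg = match.group("msg")
        if msg.startswith("locality:"):
            summary[pid]["locality"] = msg.split(":", 1)[1].strip()
        elif msg.startswith("coll:sm:init_query:") and "disqualifying" in msg:
            summary[pid]["sm_disqualified"] = True
    return dict(summary)
```

Calling `summarize()` on the saved output of each run gives one entry per process id, so the locality strings (for example `CL:CU:N:B` versus `CL:CU:N:B:Nu:S`) can be compared side by side with whether coll sm dropped out.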


On Jul 3, 2012, at 1:50 PM, Juan A. Rico wrote:

> Here is the output.
> 
> [jarico@Metropolis-01 examples]$ 
> /home/jarico/shared/packages/openmpi-cas-dbg/bin/mpiexec --bind-to-core 
> --bynode --mca mca_base_verbose 100 --mca mca_coll_base_output 100  --mca 
> coll_sm_priority 99 -mca hwloc_base_verbose 90 --display-map --mca 
> mca_verbose 100 --mca mca_base_verbose 100 --mca coll_base_verbose 100 -n 2 
> -mca grpcomm_base_verbose 5 ./bmem
> [Metropolis-01:24563] mca: base: components_open: Looking for hwloc components
> [Metropolis-01:24563] mca: base: components_open: opening hwloc components
> [Metropolis-01:24563] mca: base: components_open: found loaded component 
> hwloc142
> [Metropolis-01:24563] mca: base: components_open: component hwloc142 has no 
> register function
> [Metropolis-01:24563] mca: base: components_open: component hwloc142 has no 
> open function
> [Metropolis-01:24563] hwloc:base:get_topology
> [Metropolis-01:24563] hwloc:base: no cpus specified - using root available 
> cpuset
> [Metropolis-01:24563] mca:base:select:(grpcomm) Querying component [bad]
> [Metropolis-01:24563] mca:base:select:(grpcomm) Query of component [bad] set 
> priority to 10
> [Metropolis-01:24563] mca:base:select:(grpcomm) Selected component [bad]
> [Metropolis-01:24563] [[36265,0],0] grpcomm:base:receive start comm
> --------------------------------------------------------------------------
> WARNING: a request was made to bind a process. While the system
> supports binding the process itself, at least one node does NOT
> support binding memory to the process location.
> 
>  Node:  Metropolis-01
> 
> This is a warning only; your job will continue, though performance may
> be degraded.
> --------------------------------------------------------------------------
> [Metropolis-01:24563] hwloc:base: get available cpus
> [Metropolis-01:24563] hwloc:base:filter_cpus specified - already done
> [Metropolis-01:24563] hwloc:base: get available cpus
> [Metropolis-01:24563] hwloc:base:filter_cpus specified - already done
> [Metropolis-01:24563] hwloc:base: get available cpus
> [Metropolis-01:24563] hwloc:base:filter_cpus specified - already done
> [Metropolis-01:24563] hwloc:base: get available cpus
> [Metropolis-01:24563] hwloc:base:filter_cpus specified - already done
> [Metropolis-01:24563] hwloc:base: get available cpus
> [Metropolis-01:24563] hwloc:base:filter_cpus specified - already done
> [Metropolis-01:24563] hwloc:base: get available cpus
> [Metropolis-01:24563] hwloc:base:filter_cpus specified - already done
> [Metropolis-01:24563] hwloc:base: get available cpus
> [Metropolis-01:24563] hwloc:base:filter_cpus specified - already done
> [Metropolis-01:24563] hwloc:base: get available cpus
> [Metropolis-01:24563] hwloc:base:filter_cpus specified - already done
> [Metropolis-01:24563] hwloc:base:get_nbojbs computed data 8 of Core:0
> [Metropolis-01:24563] hwloc:base: get available cpus
> [Metropolis-01:24563] hwloc:base:filter_cpus specified - already done
> [Metropolis-01:24563] hwloc:base: get available cpus
> [Metropolis-01:24563] hwloc:base:filter_cpus specified - already done
> 
> ========================   JOB MAP   ========================
> 
> Data for node: Metropolis-01  Num procs: 2
>       Process OMPI jobid: [36265,1] App: 0 Process rank: 0
>       Process OMPI jobid: [36265,1] App: 0 Process rank: 1
> 
> =============================================================
> [Metropolis-01:24563] [[36265,0],0] grpcomm:bad:xcast sent to job [36265,0] 
> tag 1
> [Metropolis-01:24563] [[36265,0],0] grpcomm:xcast:recv:send_relay
> [Metropolis-01:24563] [[36265,0],0] grpcomm:base:xcast updating daemon nidmap
> [Metropolis-01:24563] [[36265,0],0] orte:daemon:send_relay - recipient list 
> is empty!
> [Metropolis-01:24564] mca: base: components_open: Looking for hwloc components
> [Metropolis-01:24564] mca: base: components_open: opening hwloc components
> [Metropolis-01:24564] mca: base: components_open: found loaded component 
> hwloc142
> [Metropolis-01:24564] mca: base: components_open: component hwloc142 has no 
> register function
> [Metropolis-01:24564] mca: base: components_open: component hwloc142 has no 
> open function
> [Metropolis-01:24565] mca: base: components_open: Looking for hwloc components
> [Metropolis-01:24565] mca: base: components_open: opening hwloc components
> [Metropolis-01:24565] mca: base: components_open: found loaded component 
> hwloc142
> [Metropolis-01:24565] mca: base: components_open: component hwloc142 has no 
> register function
> [Metropolis-01:24565] mca: base: components_open: component hwloc142 has no 
> open function
> [Metropolis-01:24564] mca:base:select:(grpcomm) Querying component [bad]
> [Metropolis-01:24564] mca:base:select:(grpcomm) Query of component [bad] set 
> priority to 10
> [Metropolis-01:24564] mca:base:select:(grpcomm) Selected component [bad]
> [Metropolis-01:24564] [[36265,1],0] grpcomm:base:receive start comm
> [Metropolis-01:24564] computing locality - getting object at level CORE, 
> index 0
> [Metropolis-01:24564] hwloc:base: get available cpus
> [Metropolis-01:24564] hwloc:base:get_available_cpus first time - filtering 
> cpus
> [Metropolis-01:24564] hwloc:base: no cpus specified - using root available 
> cpuset
> [Metropolis-01:24564] computing locality - getting object at level CORE, 
> index 1
> [Metropolis-01:24564] hwloc:base: get available cpus
> [Metropolis-01:24564] hwloc:base:filter_cpus specified - already done
> [Metropolis-01:24564] computing locality - shifting up from L1CACHE
> [Metropolis-01:24564] computing locality - shifting up from L2CACHE
> [Metropolis-01:24564] computing locality - shifting up from L3CACHE
> [Metropolis-01:24564] computing locality - filling level SOCKET
> [Metropolis-01:24564] computing locality - filling level NUMA
> [Metropolis-01:24564] locality: CL:CU:N:B:Nu:S
> [Metropolis-01:24565] mca:base:select:(grpcomm) Querying component [bad]
> [Metropolis-01:24565] mca:base:select:(grpcomm) Query of component [bad] set 
> priority to 10
> [Metropolis-01:24565] mca:base:select:(grpcomm) Selected component [bad]
> [Metropolis-01:24565] [[36265,1],1] grpcomm:base:receive start comm
> [Metropolis-01:24564] mca: base: components_open: Looking for coll components
> [Metropolis-01:24564] mca: base: components_open: opening coll components
> [Metropolis-01:24564] mca: base: components_open: found loaded component tuned
> [Metropolis-01:24564] mca: base: components_open: component tuned has no 
> register function
> [Metropolis-01:24564] coll:tuned:component_open: done!
> [Metropolis-01:24564] mca: base: components_open: component tuned open 
> function successful
> [Metropolis-01:24564] mca: base: components_open: found loaded component sm
> [Metropolis-01:24564] mca: base: components_open: component sm register 
> function successful
> [Metropolis-01:24564] mca: base: components_open: component sm has no open 
> function
> [Metropolis-01:24564] mca: base: components_open: found loaded component 
> libnbc
> [Metropolis-01:24564] mca: base: components_open: component libnbc register 
> function successful
> [Metropolis-01:24564] mca: base: components_open: component libnbc open 
> function successful
> [Metropolis-01:24564] mca: base: components_open: found loaded component 
> hierarch
> [Metropolis-01:24564] mca: base: components_open: component hierarch has no 
> register function
> [Metropolis-01:24564] mca: base: components_open: component hierarch open 
> function successful
> [Metropolis-01:24564] mca: base: components_open: found loaded component basic
> [Metropolis-01:24564] mca: base: components_open: component basic register 
> function successful
> [Metropolis-01:24564] mca: base: components_open: component basic has no open 
> function
> [Metropolis-01:24564] mca: base: components_open: found loaded component inter
> [Metropolis-01:24564] mca: base: components_open: component inter has no 
> register function
> [Metropolis-01:24564] mca: base: components_open: component inter open 
> function successful
> [Metropolis-01:24564] mca: base: components_open: found loaded component self
> [Metropolis-01:24564] mca: base: components_open: component self has no 
> register function
> [Metropolis-01:24564] mca: base: components_open: component self open 
> function successful
> [Metropolis-01:24565] computing locality - getting object at level CORE, 
> index 1
> [Metropolis-01:24565] hwloc:base: get available cpus
> [Metropolis-01:24565] hwloc:base:get_available_cpus first time - filtering 
> cpus
> [Metropolis-01:24565] hwloc:base: no cpus specified - using root available 
> cpuset
> [Metropolis-01:24565] hwloc:base: get available cpus
> [Metropolis-01:24565] hwloc:base:filter_cpus specified - already done
> [Metropolis-01:24565] computing locality - getting object at level CORE, 
> index 0
> [Metropolis-01:24565] computing locality - shifting up from L1CACHE
> [Metropolis-01:24565] computing locality - shifting up from L2CACHE
> [Metropolis-01:24565] computing locality - shifting up from L3CACHE
> [Metropolis-01:24565] computing locality - filling level SOCKET
> [Metropolis-01:24565] computing locality - filling level NUMA
> [Metropolis-01:24565] locality: CL:CU:N:B:Nu:S
> [Metropolis-01:24563] [[36265,0],0] COLLECTIVE RECVD FROM [[36265,1],0]
> [Metropolis-01:24563] [[36265,0],0] WORKING COLLECTIVE 0
> [Metropolis-01:24563] [[36265,0],0] ADDING [[36265,1],WILDCARD] TO 
> PARTICIPANTS
> [Metropolis-01:24563] [[36265,0],0] PROGRESSING COLLECTIVE 0
> [Metropolis-01:24563] [[36265,0],0] PROGRESSING COLL id 0
> [Metropolis-01:24563] [[36265,0],0] ALL LOCAL PROCS CONTRIBUTE 2
> [Metropolis-01:24564] [[36265,1],0] grpcomm:base:modex: performing modex
> [Metropolis-01:24564] [[36265,1],0] grpcomm:base:pack_modex: reporting 4 
> entries
> [Metropolis-01:24564] [[36265,1],0] grpcomm:base:full:modex: executing 
> allgather
> [Metropolis-01:24564] [[36265,1],0] grpcomm:bad entering allgather
> [Metropolis-01:24564] [[36265,1],0] grpcomm:bad allgather underway
> [Metropolis-01:24564] [[36265,1],0] grpcomm:base:modex: modex posted
> [Metropolis-01:24565] mca: base: components_open: Looking for coll components
> [Metropolis-01:24565] mca: base: components_open: opening coll components
> [Metropolis-01:24565] mca: base: components_open: found loaded component tuned
> [Metropolis-01:24565] mca: base: components_open: component tuned has no 
> register function
> [Metropolis-01:24565] coll:tuned:component_open: done!
> [Metropolis-01:24565] mca: base: components_open: component tuned open 
> function successful
> [Metropolis-01:24565] mca: base: components_open: found loaded component sm
> [Metropolis-01:24565] mca: base: components_open: component sm register 
> function successful
> [Metropolis-01:24565] mca: base: components_open: component sm has no open 
> function
> [Metropolis-01:24565] mca: base: components_open: found loaded component 
> libnbc
> [Metropolis-01:24565] mca: base: components_open: component libnbc register 
> function successful
> [Metropolis-01:24565] mca: base: components_open: component libnbc open 
> function successful
> [Metropolis-01:24565] mca: base: components_open: found loaded component 
> hierarch
> [Metropolis-01:24565] mca: base: components_open: component hierarch has no 
> register function
> [Metropolis-01:24565] mca: base: components_open: component hierarch open 
> function successful
> [Metropolis-01:24565] mca: base: components_open: found loaded component basic
> [Metropolis-01:24565] mca: base: components_open: component basic register 
> function successful
> [Metropolis-01:24565] mca: base: components_open: component basic has no open 
> function
> [Metropolis-01:24565] mca: base: components_open: found loaded component inter
> [Metropolis-01:24565] mca: base: components_open: component inter has no 
> register function
> [Metropolis-01:24565] mca: base: components_open: component inter open 
> function successful
> [Metropolis-01:24565] mca: base: components_open: found loaded component self
> [Metropolis-01:24565] mca: base: components_open: component self has no 
> register function
> [Metropolis-01:24565] mca: base: components_open: component self open 
> function successful
> [Metropolis-01:24563] [[36265,0],0] COLLECTIVE RECVD FROM [[36265,1],1]
> [Metropolis-01:24563] [[36265,0],0] WORKING COLLECTIVE 0
> [Metropolis-01:24563] [[36265,0],0] PROGRESSING COLLECTIVE 0
> [Metropolis-01:24563] [[36265,0],0] PROGRESSING COLL id 0
> [Metropolis-01:24563] [[36265,0],0] ALL LOCAL PROCS CONTRIBUTE 2
> [Metropolis-01:24563] [[36265,0],0] COLLECTIVE 0 LOCALLY COMPLETE - SENDING 
> TO GLOBAL COLLECTIVE
> [Metropolis-01:24563] [[36265,0],0] grpcomm:base:daemon_coll: daemon 
> collective recvd from [[36265,0],0]
> [Metropolis-01:24563] [[36265,0],0] grpcomm:base:daemon_coll: WORKING 
> COLLECTIVE 0
> [Metropolis-01:24563] [[36265,0],0] grpcomm:base:daemon_coll: NUM CONTRIBS: 2
> [Metropolis-01:24563] [[36265,0],0] grpcomm:bad:xcast sent to job [36265,1] 
> tag 30
> [Metropolis-01:24563] [[36265,0],0] grpcomm:xcast:recv:send_relay
> [Metropolis-01:24563] [[36265,0],0] orte:daemon:send_relay - recipient list 
> is empty!
> [Metropolis-01:24565] [[36265,1],1] grpcomm:base:modex: performing modex
> [Metropolis-01:24565] [[36265,1],1] grpcomm:base:pack_modex: reporting 4 
> entries
> [Metropolis-01:24565] [[36265,1],1] grpcomm:base:full:modex: executing 
> allgather
> [Metropolis-01:24565] [[36265,1],1] grpcomm:bad entering allgather
> [Metropolis-01:24565] [[36265,1],1] grpcomm:bad allgather underway
> [Metropolis-01:24565] [[36265,1],1] grpcomm:base:modex: modex posted
> [Metropolis-01:24564] [[36265,1],0] grpcomm:base:receive processing 
> collective return for id 0
> [Metropolis-01:24564] [[36265,1],0] CHECKING COLL id 0
> [Metropolis-01:24564] [[36265,1],0] STORING MODEX DATA
> [Metropolis-01:24564] [[36265,1],0] grpcomm:base:store_modex adding modex 
> entry for proc [[36265,1],0]
> [Metropolis-01:24565] [[36265,1],1] grpcomm:base:receive processing 
> collective return for id 0
> [Metropolis-01:24565] [[36265,1],1] CHECKING COLL id 0
> [Metropolis-01:24565] [[36265,1],1] STORING MODEX DATA
> [Metropolis-01:24565] [[36265,1],1] grpcomm:base:store_modex adding modex 
> entry for proc [[36265,1],0]
> [Metropolis-01:24564] [[36265,1],0] grpcomm:base:update_modex_entries: adding 
> 4 entries for proc [[36265,1],0]
> [Metropolis-01:24564] [[36265,1],0] grpcomm:base:store_modex adding modex 
> entry for proc [[36265,1],1]
> [Metropolis-01:24564] [[36265,1],0] grpcomm:base:update_modex_entries: adding 
> 4 entries for proc [[36265,1],1]
> [Metropolis-01:24565] [[36265,1],1] grpcomm:base:update_modex_entries: adding 
> 4 entries for proc [[36265,1],0]
> [Metropolis-01:24565] [[36265,1],1] grpcomm:base:store_modex adding modex 
> entry for proc [[36265,1],1]
> [Metropolis-01:24565] [[36265,1],1] grpcomm:base:update_modex_entries: adding 
> 4 entries for proc [[36265,1],1]
> [Metropolis-01:24564] coll:find_available: querying coll component tuned
> [Metropolis-01:24564] coll:find_available: coll component tuned is available
> [Metropolis-01:24565] coll:find_available: querying coll component tuned
> [Metropolis-01:24565] coll:find_available: coll component tuned is available
> [Metropolis-01:24565] coll:find_available: querying coll component sm
> [Metropolis-01:24564] coll:find_available: querying coll component sm
> [Metropolis-01:24564] coll:sm:init_query: no other local procs; disqualifying 
> myself
> [Metropolis-01:24564] coll:find_available: coll component sm is not available
> [Metropolis-01:24564] coll:find_available: querying coll component libnbc
> [Metropolis-01:24564] coll:find_available: coll component libnbc is available
> [Metropolis-01:24564] coll:find_available: querying coll component hierarch
> [Metropolis-01:24564] coll:find_available: coll component hierarch is 
> available
> [Metropolis-01:24564] coll:find_available: querying coll component basic
> [Metropolis-01:24564] coll:find_available: coll component basic is available
> [Metropolis-01:24565] coll:sm:init_query: no other local procs; disqualifying 
> myself
> [Metropolis-01:24565] coll:find_available: coll component sm is not available
> [Metropolis-01:24565] coll:find_available: querying coll component libnbc
> [Metropolis-01:24565] coll:find_available: coll component libnbc is available
> [Metropolis-01:24565] coll:find_available: querying coll component hierarch
> [Metropolis-01:24565] coll:find_available: coll component hierarch is 
> available
> [Metropolis-01:24565] coll:find_available: querying coll component basic
> [Metropolis-01:24565] coll:find_available: coll component basic is available
> [Metropolis-01:24564] coll:find_available: querying coll component inter
> [Metropolis-01:24564] coll:find_available: coll component inter is available
> [Metropolis-01:24564] coll:find_available: querying coll component self
> [Metropolis-01:24564] coll:find_available: coll component self is available
> [Metropolis-01:24565] coll:find_available: querying coll component inter
> [Metropolis-01:24565] coll:find_available: coll component inter is available
> [Metropolis-01:24565] coll:find_available: querying coll component self
> [Metropolis-01:24565] coll:find_available: coll component self is available
> [Metropolis-01:24565] hwloc:base:get_nbojbs computed data 0 of NUMANode:0
> [Metropolis-01:24564] hwloc:base:get_nbojbs computed data 0 of NUMANode:0
> [Metropolis-01:24563] [[36265,0],0] COLLECTIVE RECVD FROM [[36265,1],1]
> [Metropolis-01:24563] [[36265,0],0] WORKING COLLECTIVE 1
> [Metropolis-01:24563] [[36265,0],0] ADDING [[36265,1],WILDCARD] TO 
> PARTICIPANTS
> [Metropolis-01:24563] [[36265,0],0] PROGRESSING COLLECTIVE 1
> [Metropolis-01:24563] [[36265,0],0] PROGRESSING COLL id 1
> [Metropolis-01:24563] [[36265,0],0] ALL LOCAL PROCS CONTRIBUTE 2
> [Metropolis-01:24563] [[36265,0],0] COLLECTIVE RECVD FROM [[36265,1],0]
> [Metropolis-01:24563] [[36265,0],0] WORKING COLLECTIVE 1
> [Metropolis-01:24563] [[36265,0],0] PROGRESSING COLLECTIVE 1
> [Metropolis-01:24563] [[36265,0],0] PROGRESSING COLL id 1
> [Metropolis-01:24563] [[36265,0],0] ALL LOCAL PROCS CONTRIBUTE 2
> [Metropolis-01:24563] [[36265,0],0] COLLECTIVE 1 LOCALLY COMPLETE - SENDING 
> TO GLOBAL COLLECTIVE
> [Metropolis-01:24563] [[36265,0],0] grpcomm:base:daemon_coll: daemon 
> collective recvd from [[36265,0],0]
> [Metropolis-01:24563] [[36265,0],0] grpcomm:base:daemon_coll: WORKING 
> COLLECTIVE 1
> [Metropolis-01:24563] [[36265,0],0] grpcomm:base:daemon_coll: NUM CONTRIBS: 2
> [Metropolis-01:24563] [[36265,0],0] grpcomm:bad:xcast sent to job [36265,1] 
> tag 30
> [Metropolis-01:24563] [[36265,0],0] grpcomm:xcast:recv:send_relay
> [Metropolis-01:24563] [[36265,0],0] orte:daemon:send_relay - recipient list 
> is empty!
> [Metropolis-01:24565] [[36265,1],1] grpcomm:bad entering barrier
> [Metropolis-01:24565] [[36265,1],1] grpcomm:bad barrier underway
> [Metropolis-01:24564] [[36265,1],0] grpcomm:bad entering barrier
> [Metropolis-01:24564] [[36265,1],0] grpcomm:bad barrier underway
> [Metropolis-01:24564] [[36265,1],0] grpcomm:base:receive processing 
> collective return for id 1
> [Metropolis-01:24564] [[36265,1],0] CHECKING COLL id 1
> [Metropolis-01:24565] [[36265,1],1] grpcomm:base:receive processing 
> collective return for id 1
> [Metropolis-01:24565] [[36265,1],1] CHECKING COLL id 1
> [Metropolis-01:24565] coll:base:comm_select: new communicator: MPI_COMM_WORLD 
> (cid 0)
> [Metropolis-01:24565] coll:base:comm_select: Checking all available modules
> [Metropolis-01:24565] coll:tuned:module_tuned query called
> [Metropolis-01:24565] coll:base:comm_select: component available: tuned, 
> priority: 30
> [Metropolis-01:24565] coll:base:comm_select: component available: libnbc, 
> priority: 10
> [Metropolis-01:24565] coll:base:comm_select: component not available: hierarch
> [Metropolis-01:24565] coll:base:comm_select: component available: basic, 
> priority: 10
> [Metropolis-01:24565] coll:base:comm_select: component not available: inter
> [Metropolis-01:24565] coll:base:comm_select: component not available: self
> [Metropolis-01:24565] coll:tuned:module_init called.
> [Metropolis-01:24565] coll:tuned:module_init Tuned is in use
> [Metropolis-01:24565] coll:base:comm_select: new communicator: MPI_COMM_SELF 
> (cid 1)
> [Metropolis-01:24565] coll:base:comm_select: Checking all available modules
> [Metropolis-01:24564] coll:base:comm_select: new communicator: MPI_COMM_WORLD 
> (cid 0)
> [Metropolis-01:24564] coll:base:comm_select: Checking all available modules
> [Metropolis-01:24564] coll:tuned:module_tuned query called
> [Metropolis-01:24564] coll:base:comm_select: component available: tuned, 
> priority: 30
> [Metropolis-01:24564] coll:base:comm_select: component available: libnbc, 
> priority: 10
> [Metropolis-01:24564] coll:base:comm_select: component not available: hierarch
> [Metropolis-01:24564] coll:base:comm_select: component available: basic, 
> priority: 10
> [Metropolis-01:24564] coll:base:comm_select: component not available: inter
> [Metropolis-01:24564] coll:base:comm_select: component not available: self
> [Metropolis-01:24564] coll:tuned:module_init called.
> [Metropolis-01:24565] coll:tuned:module_tuned query called
> [Metropolis-01:24565] coll:base:comm_select: component not available: tuned
> [Metropolis-01:24565] coll:base:comm_select: component available: libnbc, 
> priority: 10
> [Metropolis-01:24565] coll:base:comm_select: component not available: hierarch
> [Metropolis-01:24565] coll:base:comm_select: component available: basic, 
> priority: 10
> [Metropolis-01:24565] coll:base:comm_select: component not available: inter
> [Metropolis-01:24565] coll:base:comm_select: component available: self, 
> priority: 75
> [Metropolis-01:24564] coll:tuned:module_init Tuned is in use
> [Metropolis-01:24564] coll:base:comm_select: new communicator: MPI_COMM_SELF 
> (cid 1)
> [Metropolis-01:24564] coll:base:comm_select: Checking all available modules
> [Metropolis-01:24564] coll:tuned:module_tuned query called
> [Metropolis-01:24564] coll:base:comm_select: component not available: tuned
> [Metropolis-01:24564] coll:base:comm_select: component available: libnbc, 
> priority: 10
> [Metropolis-01:24564] coll:base:comm_select: component not available: hierarch
> [Metropolis-01:24564] coll:base:comm_select: component available: basic, 
> priority: 10
> [Metropolis-01:24564] coll:base:comm_select: component not available: inter
> [Metropolis-01:24564] coll:base:comm_select: component available: self, 
> priority: 75
> [Metropolis-01:24565] [[36265,1],1] grpcomm:bad entering barrier
> [Metropolis-01:24563] [[36265,0],0] COLLECTIVE RECVD FROM [[36265,1],1]
> [Metropolis-01:24563] [[36265,0],0] WORKING COLLECTIVE 2
> [Metropolis-01:24563] [[36265,0],0] ADDING [[36265,1],WILDCARD] TO 
> PARTICIPANTS
> [Metropolis-01:24563] [[36265,0],0] PROGRESSING COLLECTIVE 2
> [Metropolis-01:24563] [[36265,0],0] PROGRESSING COLL id 2
> [Metropolis-01:24563] [[36265,0],0] ALL LOCAL PROCS CONTRIBUTE 2
> [Metropolis-01:24563] [[36265,0],0] COLLECTIVE RECVD FROM [[36265,1],0]
> [Metropolis-01:24563] [[36265,0],0] WORKING COLLECTIVE 2
> [Metropolis-01:24563] [[36265,0],0] PROGRESSING COLLECTIVE 2
> [Metropolis-01:24563] [[36265,0],0] PROGRESSING COLL id 2
> [Metropolis-01:24563] [[36265,0],0] ALL LOCAL PROCS CONTRIBUTE 2
> [Metropolis-01:24563] [[36265,0],0] COLLECTIVE 2 LOCALLY COMPLETE - SENDING 
> TO GLOBAL COLLECTIVE
> [Metropolis-01:24563] [[36265,0],0] grpcomm:base:daemon_coll: daemon 
> collective recvd from [[36265,0],0]
> [Metropolis-01:24563] [[36265,0],0] grpcomm:base:daemon_coll: WORKING 
> COLLECTIVE 2
> [Metropolis-01:24563] [[36265,0],0] grpcomm:base:daemon_coll: NUM CONTRIBS: 2
> [Metropolis-01:24563] [[36265,0],0] grpcomm:bad:xcast sent to job [36265,1] 
> tag 30
> [Metropolis-01:24563] [[36265,0],0] grpcomm:xcast:recv:send_relay
> [Metropolis-01:24563] [[36265,0],0] orte:daemon:send_relay - recipient list 
> is empty!
> [Metropolis-01:24564] [[36265,1],0] grpcomm:bad entering barrier
> [Metropolis-01:24564] [[36265,1],0] grpcomm:bad barrier underway
> [Metropolis-01:24564] [[36265,1],0] grpcomm:base:receive processing 
> collective return for id 2
> [Metropolis-01:24564] [[36265,1],0] CHECKING COLL id 2
> [Metropolis-01:24565] [[36265,1],1] grpcomm:bad barrier underway
> [Metropolis-01:24565] [[36265,1],1] grpcomm:base:receive processing 
> collective return for id 2
> [Metropolis-01:24565] [[36265,1],1] CHECKING COLL id 2
> [Metropolis-01:24565] coll:tuned:component_close: called
> [Metropolis-01:24565] coll:tuned:component_close: done!
> [Metropolis-01:24565] mca: base: close: component tuned closed
> [Metropolis-01:24565] mca: base: close: unloading component tuned
> [Metropolis-01:24565] mca: base: close: component libnbc closed
> [Metropolis-01:24565] mca: base: close: unloading component libnbc
> [Metropolis-01:24565] mca: base: close: unloading component hierarch
> [Metropolis-01:24565] mca: base: close: unloading component basic
> [Metropolis-01:24565] mca: base: close: unloading component inter
> [Metropolis-01:24565] mca: base: close: unloading component self
> [Metropolis-01:24565] [[36265,1],1] grpcomm:base:receive stop comm
> [Metropolis-01:24564] coll:tuned:component_close: called
> [Metropolis-01:24564] coll:tuned:component_close: done!
> [Metropolis-01:24564] mca: base: close: component tuned closed
> [Metropolis-01:24564] mca: base: close: unloading component tuned
> [Metropolis-01:24564] mca: base: close: component libnbc closed
> [Metropolis-01:24564] mca: base: close: unloading component libnbc
> [Metropolis-01:24564] mca: base: close: unloading component hierarch
> [Metropolis-01:24564] mca: base: close: unloading component basic
> [Metropolis-01:24564] mca: base: close: unloading component inter
> [Metropolis-01:24564] mca: base: close: unloading component self
> [Metropolis-01:24564] [[36265,1],0] grpcomm:base:receive stop comm
> [Metropolis-01:24563] [[36265,0],0] grpcomm:bad:xcast sent to job [36265,0] 
> tag 1
> [Metropolis-01:24563] [[36265,0],0] grpcomm:xcast:recv:send_relay
> [Metropolis-01:24563] [[36265,0],0] orte:daemon:send_relay - recipient list 
> is empty!
> [jarico@Metropolis-01 examples]$ 
> 
> 
> 
> On 03/07/2012, at 21:44, Ralph Castain wrote:
> 
>> Interesting - yes, coll sm doesn't think they are on the same node for some 
>> reason. Try adding -mca grpcomm_base_verbose 5 and let's see why.
>> 
>> 
>> On Jul 3, 2012, at 1:24 PM, Juan Antonio Rico Gallego wrote:
>> 
>>> The code I run is a simple broadcast. 
>>> 
>>> When I do not specify components to run, the output is (more verbose):
>>> 
>>> [jarico@Metropolis-01 examples]$ 
>>> /home/jarico/shared/packages/openmpi-cas-dbg/bin/mpiexec --mca 
>>> mca_base_verbose 100 --mca mca_coll_base_output 100  --mca coll_sm_priority 
>>> 99 -mca hwloc_base_verbose 90 --display-map --mca mca_verbose 100 --mca 
>>> mca_base_verbose 100 --mca coll_base_verbose 100 -n 2 ./bmem
>>> [Metropolis-01:24490] mca: base: components_open: Looking for hwloc 
>>> components
>>> [Metropolis-01:24490] mca: base: components_open: opening hwloc components
>>> [Metropolis-01:24490] mca: base: components_open: found loaded component 
>>> hwloc142
>>> [Metropolis-01:24490] mca: base: components_open: component hwloc142 has no 
>>> register function
>>> [Metropolis-01:24490] mca: base: components_open: component hwloc142 has no 
>>> open function
>>> [Metropolis-01:24490] hwloc:base:get_topology
>>> [Metropolis-01:24490] hwloc:base: no cpus specified - using root available 
>>> cpuset
>>> 
>>> ========================   JOB MAP   ========================
>>> 
>>> Data for node: Metropolis-01        Num procs: 2
>>>     Process OMPI jobid: [36336,1] App: 0 Process rank: 0
>>>     Process OMPI jobid: [36336,1] App: 0 Process rank: 1
>>> 
>>> =============================================================
>>> [Metropolis-01:24491] mca: base: components_open: Looking for hwloc 
>>> components
>>> [Metropolis-01:24491] mca: base: components_open: opening hwloc components
>>> [Metropolis-01:24491] mca: base: components_open: found loaded component 
>>> hwloc142
>>> [Metropolis-01:24491] mca: base: components_open: component hwloc142 has no 
>>> register function
>>> [Metropolis-01:24491] mca: base: components_open: component hwloc142 has no 
>>> open function
>>> [Metropolis-01:24492] mca: base: components_open: Looking for hwloc 
>>> components
>>> [Metropolis-01:24492] mca: base: components_open: opening hwloc components
>>> [Metropolis-01:24492] mca: base: components_open: found loaded component 
>>> hwloc142
>>> [Metropolis-01:24492] mca: base: components_open: component hwloc142 has no 
>>> register function
>>> [Metropolis-01:24492] mca: base: components_open: component hwloc142 has no 
>>> open function
>>> [Metropolis-01:24491] locality: CL:CU:N:B
>>> [Metropolis-01:24491] hwloc:base: get available cpus
>>> [Metropolis-01:24491] hwloc:base:get_available_cpus first time - filtering 
>>> cpus
>>> [Metropolis-01:24491] hwloc:base: no cpus specified - using root available 
>>> cpuset
>>> [Metropolis-01:24491] hwloc:base:get_available_cpus root object
>>> [Metropolis-01:24491] mca: base: components_open: Looking for coll 
>>> components
>>> [Metropolis-01:24491] mca: base: components_open: opening coll components
>>> [Metropolis-01:24491] mca: base: components_open: found loaded component 
>>> tuned
>>> [Metropolis-01:24491] mca: base: components_open: component tuned has no 
>>> register function
>>> [Metropolis-01:24491] coll:tuned:component_open: done!
>>> [Metropolis-01:24491] mca: base: components_open: component tuned open 
>>> function successful
>>> [Metropolis-01:24491] mca: base: components_open: found loaded component sm
>>> [Metropolis-01:24491] mca: base: components_open: component sm register 
>>> function successful
>>> [Metropolis-01:24491] mca: base: components_open: component sm has no open 
>>> function
>>> [Metropolis-01:24491] mca: base: components_open: found loaded component 
>>> libnbc
>>> [Metropolis-01:24491] mca: base: components_open: component libnbc register 
>>> function successful
>>> [Metropolis-01:24491] mca: base: components_open: component libnbc open 
>>> function successful
>>> [Metropolis-01:24491] mca: base: components_open: found loaded component 
>>> hierarch
>>> [Metropolis-01:24491] mca: base: components_open: component hierarch has no 
>>> register function
>>> [Metropolis-01:24491] mca: base: components_open: component hierarch open 
>>> function successful
>>> [Metropolis-01:24491] mca: base: components_open: found loaded component 
>>> basic
>>> [Metropolis-01:24491] mca: base: components_open: component basic register 
>>> function successful
>>> [Metropolis-01:24491] mca: base: components_open: component basic has no 
>>> open function
>>> [Metropolis-01:24491] mca: base: components_open: found loaded component 
>>> inter
>>> [Metropolis-01:24491] mca: base: components_open: component inter has no 
>>> register function
>>> [Metropolis-01:24491] mca: base: components_open: component inter open 
>>> function successful
>>> [Metropolis-01:24491] mca: base: components_open: found loaded component 
>>> self
>>> [Metropolis-01:24491] mca: base: components_open: component self has no 
>>> register function
>>> [Metropolis-01:24491] mca: base: components_open: component self open 
>>> function successful
>>> [Metropolis-01:24492] locality: CL:CU:N:B
>>> [Metropolis-01:24492] hwloc:base: get available cpus
>>> [Metropolis-01:24492] hwloc:base:get_available_cpus first time - filtering 
>>> cpus
>>> [Metropolis-01:24492] hwloc:base: no cpus specified - using root available 
>>> cpuset
>>> [Metropolis-01:24492] hwloc:base:get_available_cpus root object
>>> [Metropolis-01:24492] mca: base: components_open: Looking for coll 
>>> components
>>> [Metropolis-01:24492] mca: base: components_open: opening coll components
>>> [Metropolis-01:24492] mca: base: components_open: found loaded component 
>>> tuned
>>> [Metropolis-01:24492] mca: base: components_open: component tuned has no 
>>> register function
>>> [Metropolis-01:24492] coll:tuned:component_open: done!
>>> [Metropolis-01:24492] mca: base: components_open: component tuned open 
>>> function successful
>>> [Metropolis-01:24492] mca: base: components_open: found loaded component sm
>>> [Metropolis-01:24492] mca: base: components_open: component sm register 
>>> function successful
>>> [Metropolis-01:24492] mca: base: components_open: component sm has no open 
>>> function
>>> [Metropolis-01:24492] mca: base: components_open: found loaded component 
>>> libnbc
>>> [Metropolis-01:24492] mca: base: components_open: component libnbc register 
>>> function successful
>>> [Metropolis-01:24492] mca: base: components_open: component libnbc open 
>>> function successful
>>> [Metropolis-01:24492] mca: base: components_open: found loaded component 
>>> hierarch
>>> [Metropolis-01:24492] mca: base: components_open: component hierarch has no 
>>> register function
>>> [Metropolis-01:24492] mca: base: components_open: component hierarch open 
>>> function successful
>>> [Metropolis-01:24492] mca: base: components_open: found loaded component 
>>> basic
>>> [Metropolis-01:24492] mca: base: components_open: component basic register 
>>> function successful
>>> [Metropolis-01:24492] mca: base: components_open: component basic has no 
>>> open function
>>> [Metropolis-01:24492] mca: base: components_open: found loaded component 
>>> inter
>>> [Metropolis-01:24492] mca: base: components_open: component inter has no 
>>> register function
>>> [Metropolis-01:24492] mca: base: components_open: component inter open 
>>> function successful
>>> [Metropolis-01:24492] mca: base: components_open: found loaded component 
>>> self
>>> [Metropolis-01:24492] mca: base: components_open: component self has no 
>>> register function
>>> [Metropolis-01:24492] mca: base: components_open: component self open 
>>> function successful
>>> [Metropolis-01:24491] coll:find_available: querying coll component tuned
>>> [Metropolis-01:24491] coll:find_available: coll component tuned is available
>>> [Metropolis-01:24491] coll:find_available: querying coll component sm
>>> [Metropolis-01:24491] coll:sm:init_query: no other local procs; 
>>> disqualifying myself
>>> [Metropolis-01:24491] coll:find_available: coll component sm is not 
>>> available
>>> [Metropolis-01:24491] coll:find_available: querying coll component libnbc
>>> [Metropolis-01:24491] coll:find_available: coll component libnbc is 
>>> available
>>> [Metropolis-01:24491] coll:find_available: querying coll component hierarch
>>> [Metropolis-01:24491] coll:find_available: coll component hierarch is 
>>> available
>>> [Metropolis-01:24491] coll:find_available: querying coll component basic
>>> [Metropolis-01:24491] coll:find_available: coll component basic is available
>>> [Metropolis-01:24491] coll:find_available: querying coll component inter
>>> [Metropolis-01:24492] coll:find_available: querying coll component tuned
>>> [Metropolis-01:24492] coll:find_available: coll component tuned is available
>>> [Metropolis-01:24492] coll:find_available: querying coll component sm
>>> [Metropolis-01:24492] coll:sm:init_query: no other local procs; 
>>> disqualifying myself
>>> [Metropolis-01:24492] coll:find_available: coll component sm is not 
>>> available
>>> [Metropolis-01:24492] coll:find_available: querying coll component libnbc
>>> [Metropolis-01:24492] coll:find_available: coll component libnbc is 
>>> available
>>> [Metropolis-01:24492] coll:find_available: querying coll component hierarch
>>> [Metropolis-01:24492] coll:find_available: coll component hierarch is 
>>> available
>>> [Metropolis-01:24492] coll:find_available: querying coll component basic
>>> [Metropolis-01:24492] coll:find_available: coll component basic is available
>>> [Metropolis-01:24492] coll:find_available: querying coll component inter
>>> [Metropolis-01:24492] coll:find_available: coll component inter is available
>>> [Metropolis-01:24492] coll:find_available: querying coll component self
>>> [Metropolis-01:24492] coll:find_available: coll component self is available
>>> [Metropolis-01:24491] coll:find_available: coll component inter is available
>>> [Metropolis-01:24491] coll:find_available: querying coll component self
>>> [Metropolis-01:24491] coll:find_available: coll component self is available
>>> [Metropolis-01:24492] hwloc:base:get_nbojbs computed data 0 of NUMANode:0
>>> [Metropolis-01:24491] hwloc:base:get_nbojbs computed data 0 of NUMANode:0
>>> [Metropolis-01:24491] coll:base:comm_select: new communicator: 
>>> MPI_COMM_WORLD (cid 0)
>>> [Metropolis-01:24491] coll:base:comm_select: Checking all available modules
>>> [Metropolis-01:24491] coll:tuned:module_tuned query called
>>> [Metropolis-01:24491] coll:base:comm_select: component available: tuned, 
>>> priority: 30
>>> [Metropolis-01:24491] coll:base:comm_select: component available: libnbc, 
>>> priority: 10
>>> [Metropolis-01:24491] coll:base:comm_select: component not available: 
>>> hierarch
>>> [Metropolis-01:24491] coll:base:comm_select: component available: basic, 
>>> priority: 10
>>> [Metropolis-01:24491] coll:base:comm_select: component not available: inter
>>> [Metropolis-01:24491] coll:base:comm_select: component not available: self
>>> [Metropolis-01:24491] coll:tuned:module_init called.
>>> [Metropolis-01:24491] coll:tuned:module_init Tuned is in use
>>> [Metropolis-01:24491] coll:base:comm_select: new communicator: 
>>> MPI_COMM_SELF (cid 1)
>>> [Metropolis-01:24491] coll:base:comm_select: Checking all available modules
>>> [Metropolis-01:24491] coll:tuned:module_tuned query called
>>> [Metropolis-01:24491] coll:base:comm_select: component not available: tuned
>>> [Metropolis-01:24491] coll:base:comm_select: component available: libnbc, 
>>> priority: 10
>>> [Metropolis-01:24491] coll:base:comm_select: component not available: 
>>> hierarch
>>> [Metropolis-01:24491] coll:base:comm_select: component available: basic, 
>>> priority: 10
>>> [Metropolis-01:24491] coll:base:comm_select: component not available: inter
>>> [Metropolis-01:24491] coll:base:comm_select: component available: self, 
>>> priority: 75
>>> [Metropolis-01:24492] coll:base:comm_select: new communicator: 
>>> MPI_COMM_WORLD (cid 0)
>>> [Metropolis-01:24492] coll:base:comm_select: Checking all available modules
>>> [Metropolis-01:24492] coll:tuned:module_tuned query called
>>> [Metropolis-01:24492] coll:base:comm_select: component available: tuned, 
>>> priority: 30
>>> [Metropolis-01:24492] coll:base:comm_select: component available: libnbc, 
>>> priority: 10
>>> [Metropolis-01:24492] coll:base:comm_select: component not available: 
>>> hierarch
>>> [Metropolis-01:24492] coll:base:comm_select: component available: basic, 
>>> priority: 10
>>> [Metropolis-01:24492] coll:base:comm_select: component not available: inter
>>> [Metropolis-01:24492] coll:base:comm_select: component not available: self
>>> [Metropolis-01:24492] coll:tuned:module_init called.
>>> [Metropolis-01:24492] coll:tuned:module_init Tuned is in use
>>> [Metropolis-01:24492] coll:base:comm_select: new communicator: 
>>> MPI_COMM_SELF (cid 1)
>>> [Metropolis-01:24492] coll:base:comm_select: Checking all available modules
>>> [Metropolis-01:24492] coll:tuned:module_tuned query called
>>> [Metropolis-01:24492] coll:base:comm_select: component not available: tuned
>>> [Metropolis-01:24492] coll:base:comm_select: component available: libnbc, 
>>> priority: 10
>>> [Metropolis-01:24492] coll:base:comm_select: component not available: 
>>> hierarch
>>> [Metropolis-01:24492] coll:base:comm_select: component available: basic, 
>>> priority: 10
>>> [Metropolis-01:24492] coll:base:comm_select: component not available: inter
>>> [Metropolis-01:24492] coll:base:comm_select: component available: self, 
>>> priority: 75
>>> [Metropolis-01:24491] coll:tuned:component_close: called
>>> [Metropolis-01:24491] coll:tuned:component_close: done!
>>> [Metropolis-01:24492] coll:tuned:component_close: called
>>> [Metropolis-01:24492] coll:tuned:component_close: done!
>>> [Metropolis-01:24492] mca: base: close: component tuned closed
>>> [Metropolis-01:24492] mca: base: close: unloading component tuned
>>> [Metropolis-01:24492] mca: base: close: component libnbc closed
>>> [Metropolis-01:24492] mca: base: close: unloading component libnbc
>>> [Metropolis-01:24492] mca: base: close: unloading component hierarch
>>> [Metropolis-01:24492] mca: base: close: unloading component basic
>>> [Metropolis-01:24492] mca: base: close: unloading component inter
>>> [Metropolis-01:24492] mca: base: close: unloading component self
>>> [Metropolis-01:24491] mca: base: close: component tuned closed
>>> [Metropolis-01:24491] mca: base: close: unloading component tuned
>>> [Metropolis-01:24491] mca: base: close: component libnbc closed
>>> [Metropolis-01:24491] mca: base: close: unloading component libnbc
>>> [Metropolis-01:24491] mca: base: close: unloading component hierarch
>>> [Metropolis-01:24491] mca: base: close: unloading component basic
>>> [Metropolis-01:24491] mca: base: close: unloading component inter
>>> [Metropolis-01:24491] mca: base: close: unloading component self
>>> [jarico@Metropolis-01 examples]$ 
>>> 
>>> 
>>> SM is not loaded because it detects no other processes on the same machine:
>>> 
>>> [Metropolis-01:24491] coll:sm:init_query: no other local procs; 
>>> disqualifying myself
>>> 
>>> The machine is a multicore machine with 8 cores.
>>> 
>>> I need to run the SM component code, and I assume that, with its priority 
>>> raised, it will be the component selected once this problem is solved.
>>> 
>>> 
>>> 
>>> On 03/07/2012, at 21:01, Jeff Squyres wrote:
>>> 
>>>> The issue is that the "sm" coll component only implements a few of the MPI 
>>>> collective operations.  It is usually mixed at run-time with other coll 
>>>> components to fill out the rest of the MPI collective operations.
>>>> 
>>>> So what is happening is that OMPI is determining that it doesn't have 
>>>> implementations of all the MPI collective operations and aborting.
>>>> 
>>>> You shouldn't need to manually select your coll module -- OMPI should 
>>>> automatically select the right collective module for you.  E.g., if all 
>>>> procs are local on a single machine and sm has a matching implementation 
>>>> for that MPI collective operation, it'll be used.
>>>> 
>>>> 
>>>> 
>>>> On Jul 3, 2012, at 2:48 PM, Juan Antonio Rico Gallego wrote:
>>>> 
>>>>> Output is:
>>>>> 
>>>>> [Metropolis-01:15355] hwloc:base:get_topology
>>>>> [Metropolis-01:15355] hwloc:base: no cpus specified - using root 
>>>>> available cpuset
>>>>> 
>>>>> ========================   JOB MAP   ========================
>>>>> 
>>>>> Data for node: Metropolis-01      Num procs: 2
>>>>>   Process OMPI jobid: [59809,1] App: 0 Process rank: 0
>>>>>   Process OMPI jobid: [59809,1] App: 0 Process rank: 1
>>>>> 
>>>>> =============================================================
>>>>> [Metropolis-01:15356] locality: CL:CU:N:B
>>>>> [Metropolis-01:15356] hwloc:base: get available cpus
>>>>> [Metropolis-01:15356] hwloc:base:get_available_cpus first time - 
>>>>> filtering cpus
>>>>> [Metropolis-01:15356] hwloc:base: no cpus specified - using root 
>>>>> available cpuset
>>>>> [Metropolis-01:15356] hwloc:base:get_available_cpus root object
>>>>> [Metropolis-01:15357] locality: CL:CU:N:B
>>>>> [Metropolis-01:15357] hwloc:base: get available cpus
>>>>> [Metropolis-01:15357] hwloc:base:get_available_cpus first time - 
>>>>> filtering cpus
>>>>> [Metropolis-01:15357] hwloc:base: no cpus specified - using root 
>>>>> available cpuset
>>>>> [Metropolis-01:15357] hwloc:base:get_available_cpus root object
>>>>> [Metropolis-01:15356] hwloc:base:get_nbojbs computed data 0 of NUMANode:0
>>>>> [Metropolis-01:15357] hwloc:base:get_nbojbs computed data 0 of NUMANode:0
>>>>> 
>>>>> 
>>>>> Regards,
>>>>> Juan A. Rico
>>>>> _______________________________________________
>>>>> devel mailing list
>>>>> de...@open-mpi.org
>>>>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>>>> 
>>>> 
>>>> -- 
>>>> Jeff Squyres
>>>> jsquy...@cisco.com
>>>> For corporate legal information go to: 
>>>> http://www.cisco.com/web/about/doing_business/legal/cri/
>>>> 
>>>> 
>>> 
>>> 
>> 
>> 
> 
> 
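
For readers following the quoted explanation of how "sm" is mixed with other coll components, here is a minimal sketch of per-operation selection by priority. It is an illustrative model only, not Open MPI's actual code: the `Component` class, the `select_per_operation` helper, and the operation sets assigned to each component are all assumptions made for the example. The priorities (tuned 30, basic 10, sm raised to 99) are taken from the command line and log output in this thread.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Component:
    """Hypothetical stand-in for a coll component (not an Open MPI type)."""
    name: str
    priority: int
    ops: frozenset          # collective operations this component implements
    available: bool = True  # False models "disqualifying myself"


def select_per_operation(components, required_ops):
    """Pick, for each operation, the highest-priority available component.

    Components are visited in ascending priority so that a higher-priority
    component overwrites the entry left by a lower-priority one.
    """
    table = {}
    usable = [c for c in components if c.available]
    for comp in sorted(usable, key=lambda c: c.priority):
        for op in comp.ops & required_ops:
            table[op] = comp.name
    missing = required_ops - set(table)
    if missing:
        # Models the abort described above: some collective has no provider.
        raise RuntimeError("no component implements: " + ", ".join(sorted(missing)))
    return table


# Assumed operation sets, chosen only to illustrate the mixing.
REQUIRED = frozenset({"barrier", "bcast", "reduce", "allreduce", "gather"})
SM_OPS = frozenset({"barrier", "bcast", "reduce", "allreduce"})

tuned = Component("tuned", 30, REQUIRED)
basic = Component("basic", 10, REQUIRED)
sm = Component("sm", 99, SM_OPS)

if __name__ == "__main__":
    for op, name in sorted(select_per_operation([tuned, basic, sm], REQUIRED).items()):
        print(op, "->", name)
```

In this model, a component that disqualifies itself (as sm does in the log when it sees no other local procs) simply drops out and the next-highest priority component fills every slot. Selecting sm alone fails, because it does not cover every required operation.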

