You're right, the code was overzealous. I fixed it by removing the parsing of the
modex data completely. In any case, the collective module gets another chance to
deselect itself upon creation of a new communicator (thus, after the modex has
completed).

  George



On Jul 6, 2012, at 2:20, Ralph Castain <rhc.open...@gmail.com> wrote:

> George: is there any reason for opening and selecting the coll framework so 
> early in mpi_init? I'm wondering if we can move that code to the end of the 
> procedure so we wouldn't need the locality info until later.
> 
> Sent from my iPad
> 
> On Jul 5, 2012, at 10:05 AM, Jeff Squyres <jsquy...@cisco.com> wrote:
> 
>> Thanks George.  I filed https://svn.open-mpi.org/trac/ompi/ticket/3162 about 
>> this.
>> 
>> 
>> On Jul 4, 2012, at 5:34 AM, Juan A. Rico wrote:
>> 
>>> Thanks all of you for your time and early responses.
>>> 
>>> After applying the patch, SM can be used by raising its priority. That is 
>>> enough for me (I hope). But it still fails when I specify --mca coll 
>>> sm,self on the command line (with tuned too).
>>> I am not going to use this release in production, only for playing with the 
>>> code :-)
>>> 
>>> Regards,
>>> Juan Antonio.
>>> 
>>> On 04/07/2012, at 02:59, George Bosilca wrote:
>>> 
>>>> Juan,
>>>> 
>>>> Something weird is going on there. The selection mechanisms for the SM 
>>>> coll and the SM BTL should be very similar. However, the SM BTL 
>>>> successfully selects itself, while the SM coll fails to determine that 
>>>> all processes are local.
>>>> 
>>>> In the coll SM the issue is that the remote procs do not have the LOCAL 
>>>> flag set, even when they are on the local node (however, the proc 
>>>> returned by ompi_proc_local() has a special flag stating that all 
>>>> processes in the job are local). I compared the initialization of the SM 
>>>> BTL and the SM coll. It turns out that somehow the procs returned by 
>>>> ompi_proc_all() and the procs provided to the add_procs of the BTLs are 
>>>> not identical. The latter have the local flag correctly set, so I went a 
>>>> little bit deeper.
>>>> 
>>>> Here is what I found while toying with gdb inside:
>>>> 
>>>> breakpoint 1, mca_coll_sm_init_query (enable_progress_threads=false, 
>>>> enable_mpi_threads=false) at coll_sm_module.c:132
>>>> 
>>>> (gdb) p procs[0]
>>>> $1 = (ompi_proc_t *) 0x109a1e8c0
>>>> (gdb) p procs[1]
>>>> $2 = (ompi_proc_t *) 0x109a1e970
>>>> (gdb) p procs[0]->proc_flags
>>>> $3 = 0
>>>> (gdb) p procs[1]->proc_flags
>>>> $4 = 4095
>>>> 
>>>> Breakpoint 2, mca_btl_sm_add_procs (btl=0x109baa1c0, nprocs=2, 
>>>> procs=0x109a319e0, peers=0x109a319f0, reachability=0x7fff691378e8) at 
>>>> btl_sm.c:427
>>>> 
>>>> (gdb) p procs[0]
>>>> $5 = (struct ompi_proc_t *) 0x109a1e8c0
>>>> (gdb) p procs[1]
>>>> $6 = (struct ompi_proc_t *) 0x109a1e970
>>>> (gdb) p procs[0]->proc_flags
>>>> $7 = 1920
>>>> (gdb) p procs[1]->proc_flags
>>>> $8 = 4095
>>>> 
>>>> Thus the problem seems to come from the fact that during the 
>>>> initialization of the SM coll the flags are not correctly set. However, 
>>>> this is somewhat expected, as the call to the initialization happens 
>>>> before the exchange of the business cards (and therefore there is no way 
>>>> to have any knowledge about the remote procs).
>>>> 
>>>> So, either something changed drastically in the way we set the flags for 
>>>> remote processes, or we have not actually used the SM coll for the last 3 
>>>> years. I think the culprit is r21967 
>>>> (https://svn.open-mpi.org/trac/ompi/changeset/21967), which added a 
>>>> "selection" logic based on knowledge about remote procs to the coll SM 
>>>> initialization function. But this selection logic runs way too early!
>>>> 
>>>> I would strongly encourage you not to use this SM collective component in 
>>>> anything related to production runs.
>>>> 
>>>> george.
>>>> 
>>>> PS: However, if you want to toy with the SM coll, apply the following patch:
>>>> Index: coll_sm_module.c
>>>> ===================================================================
>>>> --- coll_sm_module.c    (revision 26737)
>>>> +++ coll_sm_module.c    (working copy)
>>>> @@ -128,6 +128,7 @@
>>>> int mca_coll_sm_init_query(bool enable_progress_threads,
>>>>                           bool enable_mpi_threads)
>>>> {
>>>> +#if 0
>>>>    ompi_proc_t *my_proc, **procs;
>>>>    size_t i, size;
>>>> 
>>>> @@ -158,7 +159,7 @@
>>>>                            "coll:sm:init_query: no other local procs; 
>>>> disqualifying myself");
>>>>        return OMPI_ERR_NOT_AVAILABLE;
>>>>    }
>>>> -
>>>> +#endif
>>>>    /* Don't do much here because we don't really want to allocate any
>>>>       shared memory until this component is selected to be used. */
>>>>    opal_output_verbose(10, mca_coll_base_output,
>>>> 
>>>> 
>>>> 
>>>> 
>>>> 
>>>> On Jul 4, 2012, at 02:05 , Ralph Castain wrote:
>>>> 
>>>>> Okay, please try this again with r26739 or above. You can remove the rest 
>>>>> of the "verbose" settings and the --display-map so we declutter the 
>>>>> output. Please add "-mca orte_nidmap_verbose 20" to your cmd line.
>>>>> 
>>>>> Thanks!
>>>>> Ralph
>>>>> 
>>>>> 
>>>>> On Tue, Jul 3, 2012 at 1:50 PM, Juan A. Rico <jar...@unex.es> wrote:
>>>>> Here is the output.
>>>>> 
>>>>> [jarico@Metropolis-01 examples]$ 
>>>>> /home/jarico/shared/packages/openmpi-cas-dbg/bin/mpiexec --bind-to-core 
>>>>> --bynode --mca mca_base_verbose 100 --mca mca_coll_base_output 100  --mca 
>>>>> coll_sm_priority 99 -mca hwloc_base_verbose 90 --display-map --mca 
>>>>> mca_verbose 100 --mca mca_base_verbose 100 --mca coll_base_verbose 100 -n 
>>>>> 2 -mca grpcomm_base_verbose 5 ./bmem
>>>>> [Metropolis-01:24563] mca: base: components_open: Looking for hwloc 
>>>>> components
>>>>> [Metropolis-01:24563] mca: base: components_open: opening hwloc components
>>>>> [Metropolis-01:24563] mca: base: components_open: found loaded component 
>>>>> hwloc142
>>>>> [Metropolis-01:24563] mca: base: components_open: component hwloc142 has 
>>>>> no register function
>>>>> [Metropolis-01:24563] mca: base: components_open: component hwloc142 has 
>>>>> no open function
>>>>> [Metropolis-01:24563] hwloc:base:get_topology
>>>>> [Metropolis-01:24563] hwloc:base: no cpus specified - using root 
>>>>> available cpuset
>>>>> [Metropolis-01:24563] mca:base:select:(grpcomm) Querying component [bad]
>>>>> [Metropolis-01:24563] mca:base:select:(grpcomm) Query of component [bad] 
>>>>> set priority to 10
>>>>> [Metropolis-01:24563] mca:base:select:(grpcomm) Selected component [bad]
>>>>> [Metropolis-01:24563] [[36265,0],0] grpcomm:base:receive start comm
>>>>> --------------------------------------------------------------------------
>>>>> WARNING: a request was made to bind a process. While the system
>>>>> supports binding the process itself, at least one node does NOT
>>>>> support binding memory to the process location.
>>>>> 
>>>>> Node:  Metropolis-01
>>>>> 
>>>>> This is a warning only; your job will continue, though performance may
>>>>> be degraded.
>>>>> --------------------------------------------------------------------------
>>>>> [Metropolis-01:24563] hwloc:base: get available cpus
>>>>> [Metropolis-01:24563] hwloc:base:filter_cpus specified - already done
>>>>> [Metropolis-01:24563] hwloc:base: get available cpus
>>>>> [Metropolis-01:24563] hwloc:base:filter_cpus specified - already done
>>>>> [Metropolis-01:24563] hwloc:base: get available cpus
>>>>> [Metropolis-01:24563] hwloc:base:filter_cpus specified - already done
>>>>> [Metropolis-01:24563] hwloc:base: get available cpus
>>>>> [Metropolis-01:24563] hwloc:base:filter_cpus specified - already done
>>>>> [Metropolis-01:24563] hwloc:base: get available cpus
>>>>> [Metropolis-01:24563] hwloc:base:filter_cpus specified - already done
>>>>> [Metropolis-01:24563] hwloc:base: get available cpus
>>>>> [Metropolis-01:24563] hwloc:base:filter_cpus specified - already done
>>>>> [Metropolis-01:24563] hwloc:base: get available cpus
>>>>> [Metropolis-01:24563] hwloc:base:filter_cpus specified - already done
>>>>> [Metropolis-01:24563] hwloc:base: get available cpus
>>>>> [Metropolis-01:24563] hwloc:base:filter_cpus specified - already done
>>>>> [Metropolis-01:24563] hwloc:base:get_nbojbs computed data 8 of Core:0
>>>>> [Metropolis-01:24563] hwloc:base: get available cpus
>>>>> [Metropolis-01:24563] hwloc:base:filter_cpus specified - already done
>>>>> [Metropolis-01:24563] hwloc:base: get available cpus
>>>>> [Metropolis-01:24563] hwloc:base:filter_cpus specified - already done
>>>>> 
>>>>> ========================   JOB MAP   ========================
>>>>> 
>>>>> Data for node: Metropolis-01   Num procs: 2
>>>>>       Process OMPI jobid: [36265,1] App: 0 Process rank: 0
>>>>>       Process OMPI jobid: [36265,1] App: 0 Process rank: 1
>>>>> 
>>>>> =============================================================
>>>>> [Metropolis-01:24563] [[36265,0],0] grpcomm:bad:xcast sent to job 
>>>>> [36265,0] tag 1
>>>>> [Metropolis-01:24563] [[36265,0],0] grpcomm:xcast:recv:send_relay
>>>>> [Metropolis-01:24563] [[36265,0],0] grpcomm:base:xcast updating daemon 
>>>>> nidmap
>>>>> [Metropolis-01:24563] [[36265,0],0] orte:daemon:send_relay - recipient 
>>>>> list is empty!
>>>>> [Metropolis-01:24564] mca: base: components_open: Looking for hwloc 
>>>>> components
>>>>> [Metropolis-01:24564] mca: base: components_open: opening hwloc components
>>>>> [Metropolis-01:24564] mca: base: components_open: found loaded component 
>>>>> hwloc142
>>>>> [Metropolis-01:24564] mca: base: components_open: component hwloc142 has 
>>>>> no register function
>>>>> [Metropolis-01:24564] mca: base: components_open: component hwloc142 has 
>>>>> no open function
>>>>> [Metropolis-01:24565] mca: base: components_open: Looking for hwloc 
>>>>> components
>>>>> [Metropolis-01:24565] mca: base: components_open: opening hwloc components
>>>>> [Metropolis-01:24565] mca: base: components_open: found loaded component 
>>>>> hwloc142
>>>>> [Metropolis-01:24565] mca: base: components_open: component hwloc142 has 
>>>>> no register function
>>>>> [Metropolis-01:24565] mca: base: components_open: component hwloc142 has 
>>>>> no open function
>>>>> [Metropolis-01:24564] mca:base:select:(grpcomm) Querying component [bad]
>>>>> [Metropolis-01:24564] mca:base:select:(grpcomm) Query of component [bad] 
>>>>> set priority to 10
>>>>> [Metropolis-01:24564] mca:base:select:(grpcomm) Selected component [bad]
>>>>> [Metropolis-01:24564] [[36265,1],0] grpcomm:base:receive start comm
>>>>> [Metropolis-01:24564] computing locality - getting object at level CORE, 
>>>>> index 0
>>>>> [Metropolis-01:24564] hwloc:base: get available cpus
>>>>> [Metropolis-01:24564] hwloc:base:get_available_cpus first time - 
>>>>> filtering cpus
>>>>> [Metropolis-01:24564] hwloc:base: no cpus specified - using root 
>>>>> available cpuset
>>>>> [Metropolis-01:24564] computing locality - getting object at level CORE, 
>>>>> index 1
>>>>> [Metropolis-01:24564] hwloc:base: get available cpus
>>>>> [Metropolis-01:24564] hwloc:base:filter_cpus specified - already done
>>>>> [Metropolis-01:24564] computing locality - shifting up from L1CACHE
>>>>> [Metropolis-01:24564] computing locality - shifting up from L2CACHE
>>>>> [Metropolis-01:24564] computing locality - shifting up from L3CACHE
>>>>> [Metropolis-01:24564] computing locality - filling level SOCKET
>>>>> [Metropolis-01:24564] computing locality - filling level NUMA
>>>>> [Metropolis-01:24564] locality: CL:CU:N:B:Nu:S
>>>>> [Metropolis-01:24565] mca:base:select:(grpcomm) Querying component [bad]
>>>>> [Metropolis-01:24565] mca:base:select:(grpcomm) Query of component [bad] 
>>>>> set priority to 10
>>>>> [Metropolis-01:24565] mca:base:select:(grpcomm) Selected component [bad]
>>>>> [Metropolis-01:24565] [[36265,1],1] grpcomm:base:receive start comm
>>>>> [Metropolis-01:24564] mca: base: components_open: Looking for coll 
>>>>> components
>>>>> [Metropolis-01:24564] mca: base: components_open: opening coll components
>>>>> [Metropolis-01:24564] mca: base: components_open: found loaded component 
>>>>> tuned
>>>>> [Metropolis-01:24564] mca: base: components_open: component tuned has no 
>>>>> register function
>>>>> [Metropolis-01:24564] coll:tuned:component_open: done!
>>>>> [Metropolis-01:24564] mca: base: components_open: component tuned open 
>>>>> function successful
>>>>> [Metropolis-01:24564] mca: base: components_open: found loaded component 
>>>>> sm
>>>>> [Metropolis-01:24564] mca: base: components_open: component sm register 
>>>>> function successful
>>>>> [Metropolis-01:24564] mca: base: components_open: component sm has no 
>>>>> open function
>>>>> [Metropolis-01:24564] mca: base: components_open: found loaded component 
>>>>> libnbc
>>>>> [Metropolis-01:24564] mca: base: components_open: component libnbc 
>>>>> register function successful
>>>>> [Metropolis-01:24564] mca: base: components_open: component libnbc open 
>>>>> function successful
>>>>> [Metropolis-01:24564] mca: base: components_open: found loaded component 
>>>>> hierarch
>>>>> [Metropolis-01:24564] mca: base: components_open: component hierarch has 
>>>>> no register function
>>>>> [Metropolis-01:24564] mca: base: components_open: component hierarch open 
>>>>> function successful
>>>>> [Metropolis-01:24564] mca: base: components_open: found loaded component 
>>>>> basic
>>>>> [Metropolis-01:24564] mca: base: components_open: component basic 
>>>>> register function successful
>>>>> [Metropolis-01:24564] mca: base: components_open: component basic has no 
>>>>> open function
>>>>> [Metropolis-01:24564] mca: base: components_open: found loaded component 
>>>>> inter
>>>>> [Metropolis-01:24564] mca: base: components_open: component inter has no 
>>>>> register function
>>>>> [Metropolis-01:24564] mca: base: components_open: component inter open 
>>>>> function successful
>>>>> [Metropolis-01:24564] mca: base: components_open: found loaded component 
>>>>> self
>>>>> [Metropolis-01:24564] mca: base: components_open: component self has no 
>>>>> register function
>>>>> [Metropolis-01:24564] mca: base: components_open: component self open 
>>>>> function successful
>>>>> [Metropolis-01:24565] computing locality - getting object at level CORE, 
>>>>> index 1
>>>>> [Metropolis-01:24565] hwloc:base: get available cpus
>>>>> [Metropolis-01:24565] hwloc:base:get_available_cpus first time - 
>>>>> filtering cpus
>>>>> [Metropolis-01:24565] hwloc:base: no cpus specified - using root 
>>>>> available cpuset
>>>>> [Metropolis-01:24565] hwloc:base: get available cpus
>>>>> [Metropolis-01:24565] hwloc:base:filter_cpus specified - already done
>>>>> [Metropolis-01:24565] computing locality - getting object at level CORE, 
>>>>> index 0
>>>>> [Metropolis-01:24565] computing locality - shifting up from L1CACHE
>>>>> [Metropolis-01:24565] computing locality - shifting up from L2CACHE
>>>>> [Metropolis-01:24565] computing locality - shifting up from L3CACHE
>>>>> [Metropolis-01:24565] computing locality - filling level SOCKET
>>>>> [Metropolis-01:24565] computing locality - filling level NUMA
>>>>> [Metropolis-01:24565] locality: CL:CU:N:B:Nu:S
>>>>> [Metropolis-01:24563] [[36265,0],0] COLLECTIVE RECVD FROM [[36265,1],0]
>>>>> [Metropolis-01:24563] [[36265,0],0] WORKING COLLECTIVE 0
>>>>> [Metropolis-01:24563] [[36265,0],0] ADDING [[36265,1],WILDCARD] TO 
>>>>> PARTICIPANTS
>>>>> [Metropolis-01:24563] [[36265,0],0] PROGRESSING COLLECTIVE 0
>>>>> [Metropolis-01:24563] [[36265,0],0] PROGRESSING COLL id 0
>>>>> [Metropolis-01:24563] [[36265,0],0] ALL LOCAL PROCS CONTRIBUTE 2
>>>>> [Metropolis-01:24564] [[36265,1],0] grpcomm:base:modex: performing modex
>>>>> [Metropolis-01:24564] [[36265,1],0] grpcomm:base:pack_modex: reporting 4 
>>>>> entries
>>>>> [Metropolis-01:24564] [[36265,1],0] grpcomm:base:full:modex: executing 
>>>>> allgather
>>>>> [Metropolis-01:24564] [[36265,1],0] grpcomm:bad entering allgather
>>>>> [Metropolis-01:24564] [[36265,1],0] grpcomm:bad allgather underway
>>>>> [Metropolis-01:24564] [[36265,1],0] grpcomm:base:modex: modex posted
>>>>> [Metropolis-01:24565] mca: base: components_open: Looking for coll 
>>>>> components
>>>>> [Metropolis-01:24565] mca: base: components_open: opening coll components
>>>>> [Metropolis-01:24565] mca: base: components_open: found loaded component 
>>>>> tuned
>>>>> [Metropolis-01:24565] mca: base: components_open: component tuned has no 
>>>>> register function
>>>>> [Metropolis-01:24565] coll:tuned:component_open: done!
>>>>> [Metropolis-01:24565] mca: base: components_open: component tuned open 
>>>>> function successful
>>>>> [Metropolis-01:24565] mca: base: components_open: found loaded component 
>>>>> sm
>>>>> [Metropolis-01:24565] mca: base: components_open: component sm register 
>>>>> function successful
>>>>> [Metropolis-01:24565] mca: base: components_open: component sm has no 
>>>>> open function
>>>>> [Metropolis-01:24565] mca: base: components_open: found loaded component 
>>>>> libnbc
>>>>> [Metropolis-01:24565] mca: base: components_open: component libnbc 
>>>>> register function successful
>>>>> [Metropolis-01:24565] mca: base: components_open: component libnbc open 
>>>>> function successful
>>>>> [Metropolis-01:24565] mca: base: components_open: found loaded component 
>>>>> hierarch
>>>>> [Metropolis-01:24565] mca: base: components_open: component hierarch has 
>>>>> no register function
>>>>> [Metropolis-01:24565] mca: base: components_open: component hierarch open 
>>>>> function successful
>>>>> [Metropolis-01:24565] mca: base: components_open: found loaded component 
>>>>> basic
>>>>> [Metropolis-01:24565] mca: base: components_open: component basic 
>>>>> register function successful
>>>>> [Metropolis-01:24565] mca: base: components_open: component basic has no 
>>>>> open function
>>>>> [Metropolis-01:24565] mca: base: components_open: found loaded component 
>>>>> inter
>>>>> [Metropolis-01:24565] mca: base: components_open: component inter has no 
>>>>> register function
>>>>> [Metropolis-01:24565] mca: base: components_open: component inter open 
>>>>> function successful
>>>>> [Metropolis-01:24565] mca: base: components_open: found loaded component 
>>>>> self
>>>>> [Metropolis-01:24565] mca: base: components_open: component self has no 
>>>>> register function
>>>>> [Metropolis-01:24565] mca: base: components_open: component self open 
>>>>> function successful
>>>>> [Metropolis-01:24563] [[36265,0],0] COLLECTIVE RECVD FROM [[36265,1],1]
>>>>> [Metropolis-01:24563] [[36265,0],0] WORKING COLLECTIVE 0
>>>>> [Metropolis-01:24563] [[36265,0],0] PROGRESSING COLLECTIVE 0
>>>>> [Metropolis-01:24563] [[36265,0],0] PROGRESSING COLL id 0
>>>>> [Metropolis-01:24563] [[36265,0],0] ALL LOCAL PROCS CONTRIBUTE 2
>>>>> [Metropolis-01:24563] [[36265,0],0] COLLECTIVE 0 LOCALLY COMPLETE - 
>>>>> SENDING TO GLOBAL COLLECTIVE
>>>>> [Metropolis-01:24563] [[36265,0],0] grpcomm:base:daemon_coll: daemon 
>>>>> collective recvd from [[36265,0],0]
>>>>> [Metropolis-01:24563] [[36265,0],0] grpcomm:base:daemon_coll: WORKING 
>>>>> COLLECTIVE 0
>>>>> [Metropolis-01:24563] [[36265,0],0] grpcomm:base:daemon_coll: NUM 
>>>>> CONTRIBS: 2
>>>>> [Metropolis-01:24563] [[36265,0],0] grpcomm:bad:xcast sent to job 
>>>>> [36265,1] tag 30
>>>>> [Metropolis-01:24563] [[36265,0],0] grpcomm:xcast:recv:send_relay
>>>>> [Metropolis-01:24563] [[36265,0],0] orte:daemon:send_relay - recipient 
>>>>> list is empty!
>>>>> [Metropolis-01:24565] [[36265,1],1] grpcomm:base:modex: performing modex
>>>>> [Metropolis-01:24565] [[36265,1],1] grpcomm:base:pack_modex: reporting 4 
>>>>> entries
>>>>> [Metropolis-01:24565] [[36265,1],1] grpcomm:base:full:modex: executing 
>>>>> allgather
>>>>> [Metropolis-01:24565] [[36265,1],1] grpcomm:bad entering allgather
>>>>> [Metropolis-01:24565] [[36265,1],1] grpcomm:bad allgather underway
>>>>> [Metropolis-01:24565] [[36265,1],1] grpcomm:base:modex: modex posted
>>>>> [Metropolis-01:24564] [[36265,1],0] grpcomm:base:receive processing 
>>>>> collective return for id 0
>>>>> [Metropolis-01:24564] [[36265,1],0] CHECKING COLL id 0
>>>>> [Metropolis-01:24564] [[36265,1],0] STORING MODEX DATA
>>>>> [Metropolis-01:24564] [[36265,1],0] grpcomm:base:store_modex adding modex 
>>>>> entry for proc [[36265,1],0]
>>>>> [Metropolis-01:24565] [[36265,1],1] grpcomm:base:receive processing 
>>>>> collective return for id 0
>>>>> [Metropolis-01:24565] [[36265,1],1] CHECKING COLL id 0
>>>>> [Metropolis-01:24565] [[36265,1],1] STORING MODEX DATA
>>>>> [Metropolis-01:24565] [[36265,1],1] grpcomm:base:store_modex adding modex 
>>>>> entry for proc [[36265,1],0]
>>>>> [Metropolis-01:24564] [[36265,1],0] grpcomm:base:update_modex_entries: 
>>>>> adding 4 entries for proc [[36265,1],0]
>>>>> [Metropolis-01:24564] [[36265,1],0] grpcomm:base:store_modex adding modex 
>>>>> entry for proc [[36265,1],1]
>>>>> [Metropolis-01:24564] [[36265,1],0] grpcomm:base:update_modex_entries: 
>>>>> adding 4 entries for proc [[36265,1],1]
>>>>> [Metropolis-01:24565] [[36265,1],1] grpcomm:base:update_modex_entries: 
>>>>> adding 4 entries for proc [[36265,1],0]
>>>>> [Metropolis-01:24565] [[36265,1],1] grpcomm:base:store_modex adding modex 
>>>>> entry for proc [[36265,1],1]
>>>>> [Metropolis-01:24565] [[36265,1],1] grpcomm:base:update_modex_entries: 
>>>>> adding 4 entries for proc [[36265,1],1]
>>>>> [Metropolis-01:24564] coll:find_available: querying coll component tuned
>>>>> [Metropolis-01:24564] coll:find_available: coll component tuned is 
>>>>> available
>>>>> [Metropolis-01:24565] coll:find_available: querying coll component tuned
>>>>> [Metropolis-01:24565] coll:find_available: coll component tuned is 
>>>>> available
>>>>> [Metropolis-01:24565] coll:find_available: querying coll component sm
>>>>> [Metropolis-01:24564] coll:find_available: querying coll component sm
>>>>> [Metropolis-01:24564] coll:sm:init_query: no other local procs; 
>>>>> disqualifying myself
>>>>> [Metropolis-01:24564] coll:find_available: coll component sm is not 
>>>>> available
>>>>> [Metropolis-01:24564] coll:find_available: querying coll component libnbc
>>>>> [Metropolis-01:24564] coll:find_available: coll component libnbc is 
>>>>> available
>>>>> [Metropolis-01:24564] coll:find_available: querying coll component 
>>>>> hierarch
>>>>> [Metropolis-01:24564] coll:find_available: coll component hierarch is 
>>>>> available
>>>>> [Metropolis-01:24564] coll:find_available: querying coll component basic
>>>>> [Metropolis-01:24564] coll:find_available: coll component basic is 
>>>>> available
>>>>> [Metropolis-01:24565] coll:sm:init_query: no other local procs; 
>>>>> disqualifying myself
>>>>> [Metropolis-01:24565] coll:find_available: coll component sm is not 
>>>>> available
>>>>> [Metropolis-01:24565] coll:find_available: querying coll component libnbc
>>>>> [Metropolis-01:24565] coll:find_available: coll component libnbc is 
>>>>> available
>>>>> [Metropolis-01:24565] coll:find_available: querying coll component 
>>>>> hierarch
>>>>> [Metropolis-01:24565] coll:find_available: coll component hierarch is 
>>>>> available
>>>>> [Metropolis-01:24565] coll:find_available: querying coll component basic
>>>>> [Metropolis-01:24565] coll:find_available: coll component basic is 
>>>>> available
>>>>> [Metropolis-01:24564] coll:find_available: querying coll component inter
>>>>> [Metropolis-01:24564] coll:find_available: coll component inter is 
>>>>> available
>>>>> [Metropolis-01:24564] coll:find_available: querying coll component self
>>>>> [Metropolis-01:24564] coll:find_available: coll component self is 
>>>>> available
>>>>> [Metropolis-01:24565] coll:find_available: querying coll component inter
>>>>> [Metropolis-01:24565] coll:find_available: coll component inter is 
>>>>> available
>>>>> [Metropolis-01:24565] coll:find_available: querying coll component self
>>>>> [Metropolis-01:24565] coll:find_available: coll component self is 
>>>>> available
>>>>> [Metropolis-01:24565] hwloc:base:get_nbojbs computed data 0 of NUMANode:0
>>>>> [Metropolis-01:24564] hwloc:base:get_nbojbs computed data 0 of NUMANode:0
>>>>> [Metropolis-01:24563] [[36265,0],0] COLLECTIVE RECVD FROM [[36265,1],1]
>>>>> [Metropolis-01:24563] [[36265,0],0] WORKING COLLECTIVE 1
>>>>> [Metropolis-01:24563] [[36265,0],0] ADDING [[36265,1],WILDCARD] TO 
>>>>> PARTICIPANTS
>>>>> [Metropolis-01:24563] [[36265,0],0] PROGRESSING COLLECTIVE 1
>>>>> [Metropolis-01:24563] [[36265,0],0] PROGRESSING COLL id 1
>>>>> [Metropolis-01:24563] [[36265,0],0] ALL LOCAL PROCS CONTRIBUTE 2
>>>>> [Metropolis-01:24563] [[36265,0],0] COLLECTIVE RECVD FROM [[36265,1],0]
>>>>> [Metropolis-01:24563] [[36265,0],0] WORKING COLLECTIVE 1
>>>>> [Metropolis-01:24563] [[36265,0],0] PROGRESSING COLLECTIVE 1
>>>>> [Metropolis-01:24563] [[36265,0],0] PROGRESSING COLL id 1
>>>>> [Metropolis-01:24563] [[36265,0],0] ALL LOCAL PROCS CONTRIBUTE 2
>>>>> [Metropolis-01:24563] [[36265,0],0] COLLECTIVE 1 LOCALLY COMPLETE - 
>>>>> SENDING TO GLOBAL COLLECTIVE
>>>>> [Metropolis-01:24563] [[36265,0],0] grpcomm:base:daemon_coll: daemon 
>>>>> collective recvd from [[36265,0],0]
>>>>> [Metropolis-01:24563] [[36265,0],0] grpcomm:base:daemon_coll: WORKING 
>>>>> COLLECTIVE 1
>>>>> [Metropolis-01:24563] [[36265,0],0] grpcomm:base:daemon_coll: NUM 
>>>>> CONTRIBS: 2
>>>>> [Metropolis-01:24563] [[36265,0],0] grpcomm:bad:xcast sent to job 
>>>>> [36265,1] tag 30
>>>>> [Metropolis-01:24563] [[36265,0],0] grpcomm:xcast:recv:send_relay
>>>>> [Metropolis-01:24563] [[36265,0],0] orte:daemon:send_relay - recipient 
>>>>> list is empty!
>>>>> [Metropolis-01:24565] [[36265,1],1] grpcomm:bad entering barrier
>>>>> [Metropolis-01:24565] [[36265,1],1] grpcomm:bad barrier underway
>>>>> [Metropolis-01:24564] [[36265,1],0] grpcomm:bad entering barrier
>>>>> [Metropolis-01:24564] [[36265,1],0] grpcomm:bad barrier underway
>>>>> [Metropolis-01:24564] [[36265,1],0] grpcomm:base:receive processing 
>>>>> collective return for id 1
>>>>> [Metropolis-01:24564] [[36265,1],0] CHECKING COLL id 1
>>>>> [Metropolis-01:24565] [[36265,1],1] grpcomm:base:receive processing 
>>>>> collective return for id 1
>>>>> [Metropolis-01:24565] [[36265,1],1] CHECKING COLL id 1
>>>>> [Metropolis-01:24565] coll:base:comm_select: new communicator: 
>>>>> MPI_COMM_WORLD (cid 0)
>>>>> [Metropolis-01:24565] coll:base:comm_select: Checking all available 
>>>>> modules
>>>>> [Metropolis-01:24565] coll:tuned:module_tuned query called
>>>>> [Metropolis-01:24565] coll:base:comm_select: component available: tuned, 
>>>>> priority: 30
>>>>> [Metropolis-01:24565] coll:base:comm_select: component available: libnbc, 
>>>>> priority: 10
>>>>> [Metropolis-01:24565] coll:base:comm_select: component not available: 
>>>>> hierarch
>>>>> [Metropolis-01:24565] coll:base:comm_select: component available: basic, 
>>>>> priority: 10
>>>>> [Metropolis-01:24565] coll:base:comm_select: component not available: 
>>>>> inter
>>>>> [Metropolis-01:24565] coll:base:comm_select: component not available: self
>>>>> [Metropolis-01:24565] coll:tuned:module_init called.
>>>>> [Metropolis-01:24565] coll:tuned:module_init Tuned is in use
>>>>> [Metropolis-01:24565] coll:base:comm_select: new communicator: 
>>>>> MPI_COMM_SELF (cid 1)
>>>>> [Metropolis-01:24565] coll:base:comm_select: Checking all available 
>>>>> modules
>>>>> [Metropolis-01:24564] coll:base:comm_select: new communicator: 
>>>>> MPI_COMM_WORLD (cid 0)
>>>>> [Metropolis-01:24564] coll:base:comm_select: Checking all available 
>>>>> modules
>>>>> [Metropolis-01:24564] coll:tuned:module_tuned query called
>>>>> [Metropolis-01:24564] coll:base:comm_select: component available: tuned, 
>>>>> priority: 30
>>>>> [Metropolis-01:24564] coll:base:comm_select: component available: libnbc, 
>>>>> priority: 10
>>>>> [Metropolis-01:24564] coll:base:comm_select: component not available: 
>>>>> hierarch
>>>>> [Metropolis-01:24564] coll:base:comm_select: component available: basic, 
>>>>> priority: 10
>>>>> [Metropolis-01:24564] coll:base:comm_select: component not available: 
>>>>> inter
>>>>> [Metropolis-01:24564] coll:base:comm_select: component not available: self
>>>>> [Metropolis-01:24564] coll:tuned:module_init called.
>>>>> [Metropolis-01:24565] coll:tuned:module_tuned query called
>>>>> [Metropolis-01:24565] coll:base:comm_select: component not available: 
>>>>> tuned
>>>>> [Metropolis-01:24565] coll:base:comm_select: component available: libnbc, 
>>>>> priority: 10
>>>>> [Metropolis-01:24565] coll:base:comm_select: component not available: 
>>>>> hierarch
>>>>> [Metropolis-01:24565] coll:base:comm_select: component available: basic, 
>>>>> priority: 10
>>>>> [Metropolis-01:24565] coll:base:comm_select: component not available: 
>>>>> inter
>>>>> [Metropolis-01:24565] coll:base:comm_select: component available: self, 
>>>>> priority: 75
>>>>> [Metropolis-01:24564] coll:tuned:module_init Tuned is in use
>>>>> [Metropolis-01:24564] coll:base:comm_select: new communicator: 
>>>>> MPI_COMM_SELF (cid 1)
>>>>> [Metropolis-01:24564] coll:base:comm_select: Checking all available 
>>>>> modules
>>>>> [Metropolis-01:24564] coll:tuned:module_tuned query called
>>>>> [Metropolis-01:24564] coll:base:comm_select: component not available: 
>>>>> tuned
>>>>> [Metropolis-01:24564] coll:base:comm_select: component available: libnbc, 
>>>>> priority: 10
>>>>> [Metropolis-01:24564] coll:base:comm_select: component not available: 
>>>>> hierarch
>>>>> [Metropolis-01:24564] coll:base:comm_select: component available: basic, 
>>>>> priority: 10
>>>>> [Metropolis-01:24564] coll:base:comm_select: component not available: 
>>>>> inter
>>>>> [Metropolis-01:24564] coll:base:comm_select: component available: self, 
>>>>> priority: 75
>>>>> [Metropolis-01:24565] [[36265,1],1] grpcomm:bad entering barrier
>>>>> [Metropolis-01:24563] [[36265,0],0] COLLECTIVE RECVD FROM [[36265,1],1]
>>>>> [Metropolis-01:24563] [[36265,0],0] WORKING COLLECTIVE 2
>>>>> [Metropolis-01:24563] [[36265,0],0] ADDING [[36265,1],WILDCARD] TO 
>>>>> PARTICIPANTS
>>>>> [Metropolis-01:24563] [[36265,0],0] PROGRESSING COLLECTIVE 2
>>>>> [Metropolis-01:24563] [[36265,0],0] PROGRESSING COLL id 2
>>>>> [Metropolis-01:24563] [[36265,0],0] ALL LOCAL PROCS CONTRIBUTE 2
>>>>> [Metropolis-01:24563] [[36265,0],0] COLLECTIVE RECVD FROM [[36265,1],0]
>>>>> [Metropolis-01:24563] [[36265,0],0] WORKING COLLECTIVE 2
>>>>> [Metropolis-01:24563] [[36265,0],0] PROGRESSING COLLECTIVE 2
>>>>> [Metropolis-01:24563] [[36265,0],0] PROGRESSING COLL id 2
>>>>> [Metropolis-01:24563] [[36265,0],0] ALL LOCAL PROCS CONTRIBUTE 2
>>>>> [Metropolis-01:24563] [[36265,0],0] COLLECTIVE 2 LOCALLY COMPLETE - 
>>>>> SENDING TO GLOBAL COLLECTIVE
>>>>> [Metropolis-01:24563] [[36265,0],0] grpcomm:base:daemon_coll: daemon 
>>>>> collective recvd from [[36265,0],0]
>>>>> [Metropolis-01:24563] [[36265,0],0] grpcomm:base:daemon_coll: WORKING 
>>>>> COLLECTIVE 2
>>>>> [Metropolis-01:24563] [[36265,0],0] grpcomm:base:daemon_coll: NUM 
>>>>> CONTRIBS: 2
>>>>> [Metropolis-01:24563] [[36265,0],0] grpcomm:bad:xcast sent to job 
>>>>> [36265,1] tag 30
>>>>> [Metropolis-01:24563] [[36265,0],0] grpcomm:xcast:recv:send_relay
>>>>> [Metropolis-01:24563] [[36265,0],0] orte:daemon:send_relay - recipient 
>>>>> list is empty!
>>>>> [Metropolis-01:24564] [[36265,1],0] grpcomm:bad entering barrier
>>>>> [Metropolis-01:24564] [[36265,1],0] grpcomm:bad barrier underway
>>>>> [Metropolis-01:24564] [[36265,1],0] grpcomm:base:receive processing 
>>>>> collective return for id 2
>>>>> [Metropolis-01:24564] [[36265,1],0] CHECKING COLL id 2
>>>>> [Metropolis-01:24565] [[36265,1],1] grpcomm:bad barrier underway
>>>>> [Metropolis-01:24565] [[36265,1],1] grpcomm:base:receive processing 
>>>>> collective return for id 2
>>>>> [Metropolis-01:24565] [[36265,1],1] CHECKING COLL id 2
>>>>> [Metropolis-01:24565] coll:tuned:component_close: called
>>>>> [Metropolis-01:24565] coll:tuned:component_close: done!
>>>>> [Metropolis-01:24565] mca: base: close: component tuned closed
>>>>> [Metropolis-01:24565] mca: base: close: unloading component tuned
>>>>> [Metropolis-01:24565] mca: base: close: component libnbc closed
>>>>> [Metropolis-01:24565] mca: base: close: unloading component libnbc
>>>>> [Metropolis-01:24565] mca: base: close: unloading component hierarch
>>>>> [Metropolis-01:24565] mca: base: close: unloading component basic
>>>>> [Metropolis-01:24565] mca: base: close: unloading component inter
>>>>> [Metropolis-01:24565] mca: base: close: unloading component self
>>>>> [Metropolis-01:24565] [[36265,1],1] grpcomm:base:receive stop comm
>>>>> [Metropolis-01:24564] coll:tuned:component_close: called
>>>>> [Metropolis-01:24564] coll:tuned:component_close: done!
>>>>> [Metropolis-01:24564] mca: base: close: component tuned closed
>>>>> [Metropolis-01:24564] mca: base: close: unloading component tuned
>>>>> [Metropolis-01:24564] mca: base: close: component libnbc closed
>>>>> [Metropolis-01:24564] mca: base: close: unloading component libnbc
>>>>> [Metropolis-01:24564] mca: base: close: unloading component hierarch
>>>>> [Metropolis-01:24564] mca: base: close: unloading component basic
>>>>> [Metropolis-01:24564] mca: base: close: unloading component inter
>>>>> [Metropolis-01:24564] mca: base: close: unloading component self
>>>>> [Metropolis-01:24564] [[36265,1],0] grpcomm:base:receive stop comm
>>>>> [Metropolis-01:24563] [[36265,0],0] grpcomm:bad:xcast sent to job 
>>>>> [36265,0] tag 1
>>>>> [Metropolis-01:24563] [[36265,0],0] grpcomm:xcast:recv:send_relay
>>>>> [Metropolis-01:24563] [[36265,0],0] orte:daemon:send_relay - recipient 
>>>>> list is empty!
>>>>> [jarico@Metropolis-01 examples]$
>>>>> 
>>>>> 
>>>>> 
>>>>> El 03/07/2012, a las 21:44, Ralph Castain escribió:
>>>>> 
>>>>>> Interesting - yes, coll sm doesn't think they are on the same node for 
>>>>>> some reason. Try adding -mca grpcomm_base_verbose 5 and let's see why
>>>>>> 
>>>>>> 
>>>>>> On Jul 3, 2012, at 1:24 PM, Juan Antonio Rico Gallego wrote:
>>>>>> 
>>>>>>> The code I run is a simple broadcast.
>>>>>>> 
>>>>>>> When I do not specify components to run, the output is (more verbose):
>>>>>>> 
>>>>>>> [jarico@Metropolis-01 examples]$ 
>>>>>>> /home/jarico/shared/packages/openmpi-cas-dbg/bin/mpiexec --mca 
>>>>>>> mca_base_verbose 100 --mca mca_coll_base_output 100  --mca 
>>>>>>> coll_sm_priority 99 -mca hwloc_base_verbose 90 --display-map --mca 
>>>>>>> mca_verbose 100 --mca mca_base_verbose 100 --mca coll_base_verbose 100 
>>>>>>> -n 2 ./bmem
>>>>>>> [Metropolis-01:24490] mca: base: components_open: Looking for hwloc 
>>>>>>> components
>>>>>>> [Metropolis-01:24490] mca: base: components_open: opening hwloc 
>>>>>>> components
>>>>>>> [Metropolis-01:24490] mca: base: components_open: found loaded 
>>>>>>> component hwloc142
>>>>>>> [Metropolis-01:24490] mca: base: components_open: component hwloc142 
>>>>>>> has no register function
>>>>>>> [Metropolis-01:24490] mca: base: components_open: component hwloc142 
>>>>>>> has no open function
>>>>>>> [Metropolis-01:24490] hwloc:base:get_topology
>>>>>>> [Metropolis-01:24490] hwloc:base: no cpus specified - using root 
>>>>>>> available cpuset
>>>>>>> 
>>>>>>> ========================   JOB MAP   ========================
>>>>>>> 
>>>>>>> Data for node: Metropolis-01 Num procs: 2
>>>>>>>    Process OMPI jobid: [36336,1] App: 0 Process rank: 0
>>>>>>>    Process OMPI jobid: [36336,1] App: 0 Process rank: 1
>>>>>>> 
>>>>>>> =============================================================
>>>>>>> [Metropolis-01:24491] mca: base: components_open: Looking for hwloc 
>>>>>>> components
>>>>>>> [Metropolis-01:24491] mca: base: components_open: opening hwloc 
>>>>>>> components
>>>>>>> [Metropolis-01:24491] mca: base: components_open: found loaded 
>>>>>>> component hwloc142
>>>>>>> [Metropolis-01:24491] mca: base: components_open: component hwloc142 
>>>>>>> has no register function
>>>>>>> [Metropolis-01:24491] mca: base: components_open: component hwloc142 
>>>>>>> has no open function
>>>>>>> [Metropolis-01:24492] mca: base: components_open: Looking for hwloc 
>>>>>>> components
>>>>>>> [Metropolis-01:24492] mca: base: components_open: opening hwloc 
>>>>>>> components
>>>>>>> [Metropolis-01:24492] mca: base: components_open: found loaded 
>>>>>>> component hwloc142
>>>>>>> [Metropolis-01:24492] mca: base: components_open: component hwloc142 
>>>>>>> has no register function
>>>>>>> [Metropolis-01:24492] mca: base: components_open: component hwloc142 
>>>>>>> has no open function
>>>>>>> [Metropolis-01:24491] locality: CL:CU:N:B
>>>>>>> [Metropolis-01:24491] hwloc:base: get available cpus
>>>>>>> [Metropolis-01:24491] hwloc:base:get_available_cpus first time - 
>>>>>>> filtering cpus
>>>>>>> [Metropolis-01:24491] hwloc:base: no cpus specified - using root 
>>>>>>> available cpuset
>>>>>>> [Metropolis-01:24491] hwloc:base:get_available_cpus root object
>>>>>>> [Metropolis-01:24491] mca: base: components_open: Looking for coll 
>>>>>>> components
>>>>>>> [Metropolis-01:24491] mca: base: components_open: opening coll 
>>>>>>> components
>>>>>>> [Metropolis-01:24491] mca: base: components_open: found loaded 
>>>>>>> component tuned
>>>>>>> [Metropolis-01:24491] mca: base: components_open: component tuned has 
>>>>>>> no register function
>>>>>>> [Metropolis-01:24491] coll:tuned:component_open: done!
>>>>>>> [Metropolis-01:24491] mca: base: components_open: component tuned open 
>>>>>>> function successful
>>>>>>> [Metropolis-01:24491] mca: base: components_open: found loaded 
>>>>>>> component sm
>>>>>>> [Metropolis-01:24491] mca: base: components_open: component sm register 
>>>>>>> function successful
>>>>>>> [Metropolis-01:24491] mca: base: components_open: component sm has no 
>>>>>>> open function
>>>>>>> [Metropolis-01:24491] mca: base: components_open: found loaded 
>>>>>>> component libnbc
>>>>>>> [Metropolis-01:24491] mca: base: components_open: component libnbc 
>>>>>>> register function successful
>>>>>>> [Metropolis-01:24491] mca: base: components_open: component libnbc open 
>>>>>>> function successful
>>>>>>> [Metropolis-01:24491] mca: base: components_open: found loaded 
>>>>>>> component hierarch
>>>>>>> [Metropolis-01:24491] mca: base: components_open: component hierarch 
>>>>>>> has no register function
>>>>>>> [Metropolis-01:24491] mca: base: components_open: component hierarch 
>>>>>>> open function successful
>>>>>>> [Metropolis-01:24491] mca: base: components_open: found loaded 
>>>>>>> component basic
>>>>>>> [Metropolis-01:24491] mca: base: components_open: component basic 
>>>>>>> register function successful
>>>>>>> [Metropolis-01:24491] mca: base: components_open: component basic has 
>>>>>>> no open function
>>>>>>> [Metropolis-01:24491] mca: base: components_open: found loaded 
>>>>>>> component inter
>>>>>>> [Metropolis-01:24491] mca: base: components_open: component inter has 
>>>>>>> no register function
>>>>>>> [Metropolis-01:24491] mca: base: components_open: component inter open 
>>>>>>> function successful
>>>>>>> [Metropolis-01:24491] mca: base: components_open: found loaded 
>>>>>>> component self
>>>>>>> [Metropolis-01:24491] mca: base: components_open: component self has no 
>>>>>>> register function
>>>>>>> [Metropolis-01:24491] mca: base: components_open: component self open 
>>>>>>> function successful
>>>>>>> [Metropolis-01:24492] locality: CL:CU:N:B
>>>>>>> [Metropolis-01:24492] hwloc:base: get available cpus
>>>>>>> [Metropolis-01:24492] hwloc:base:get_available_cpus first time - 
>>>>>>> filtering cpus
>>>>>>> [Metropolis-01:24492] hwloc:base: no cpus specified - using root 
>>>>>>> available cpuset
>>>>>>> [Metropolis-01:24492] hwloc:base:get_available_cpus root object
>>>>>>> [Metropolis-01:24492] mca: base: components_open: Looking for coll 
>>>>>>> components
>>>>>>> [Metropolis-01:24492] mca: base: components_open: opening coll 
>>>>>>> components
>>>>>>> [Metropolis-01:24492] mca: base: components_open: found loaded 
>>>>>>> component tuned
>>>>>>> [Metropolis-01:24492] mca: base: components_open: component tuned has 
>>>>>>> no register function
>>>>>>> [Metropolis-01:24492] coll:tuned:component_open: done!
>>>>>>> [Metropolis-01:24492] mca: base: components_open: component tuned open 
>>>>>>> function successful
>>>>>>> [Metropolis-01:24492] mca: base: components_open: found loaded 
>>>>>>> component sm
>>>>>>> [Metropolis-01:24492] mca: base: components_open: component sm register 
>>>>>>> function successful
>>>>>>> [Metropolis-01:24492] mca: base: components_open: component sm has no 
>>>>>>> open function
>>>>>>> [Metropolis-01:24492] mca: base: components_open: found loaded 
>>>>>>> component libnbc
>>>>>>> [Metropolis-01:24492] mca: base: components_open: component libnbc 
>>>>>>> register function successful
>>>>>>> [Metropolis-01:24492] mca: base: components_open: component libnbc open 
>>>>>>> function successful
>>>>>>> [Metropolis-01:24492] mca: base: components_open: found loaded 
>>>>>>> component hierarch
>>>>>>> [Metropolis-01:24492] mca: base: components_open: component hierarch 
>>>>>>> has no register function
>>>>>>> [Metropolis-01:24492] mca: base: components_open: component hierarch 
>>>>>>> open function successful
>>>>>>> [Metropolis-01:24492] mca: base: components_open: found loaded 
>>>>>>> component basic
>>>>>>> [Metropolis-01:24492] mca: base: components_open: component basic 
>>>>>>> register function successful
>>>>>>> [Metropolis-01:24492] mca: base: components_open: component basic has 
>>>>>>> no open function
>>>>>>> [Metropolis-01:24492] mca: base: components_open: found loaded 
>>>>>>> component inter
>>>>>>> [Metropolis-01:24492] mca: base: components_open: component inter has 
>>>>>>> no register function
>>>>>>> [Metropolis-01:24492] mca: base: components_open: component inter open 
>>>>>>> function successful
>>>>>>> [Metropolis-01:24492] mca: base: components_open: found loaded 
>>>>>>> component self
>>>>>>> [Metropolis-01:24492] mca: base: components_open: component self has no 
>>>>>>> register function
>>>>>>> [Metropolis-01:24492] mca: base: components_open: component self open 
>>>>>>> function successful
>>>>>>> [Metropolis-01:24491] coll:find_available: querying coll component tuned
>>>>>>> [Metropolis-01:24491] coll:find_available: coll component tuned is 
>>>>>>> available
>>>>>>> [Metropolis-01:24491] coll:find_available: querying coll component sm
>>>>>>> [Metropolis-01:24491] coll:sm:init_query: no other local procs; 
>>>>>>> disqualifying myself
>>>>>>> [Metropolis-01:24491] coll:find_available: coll component sm is not 
>>>>>>> available
>>>>>>> [Metropolis-01:24491] coll:find_available: querying coll component 
>>>>>>> libnbc
>>>>>>> [Metropolis-01:24491] coll:find_available: coll component libnbc is 
>>>>>>> available
>>>>>>> [Metropolis-01:24491] coll:find_available: querying coll component 
>>>>>>> hierarch
>>>>>>> [Metropolis-01:24491] coll:find_available: coll component hierarch is 
>>>>>>> available
>>>>>>> [Metropolis-01:24491] coll:find_available: querying coll component basic
>>>>>>> [Metropolis-01:24491] coll:find_available: coll component basic is 
>>>>>>> available
>>>>>>> [Metropolis-01:24491] coll:find_available: querying coll component inter
>>>>>>> [Metropolis-01:24492] coll:find_available: querying coll component tuned
>>>>>>> [Metropolis-01:24492] coll:find_available: coll component tuned is 
>>>>>>> available
>>>>>>> [Metropolis-01:24492] coll:find_available: querying coll component sm
>>>>>>> [Metropolis-01:24492] coll:sm:init_query: no other local procs; 
>>>>>>> disqualifying myself
>>>>>>> [Metropolis-01:24492] coll:find_available: coll component sm is not 
>>>>>>> available
>>>>>>> [Metropolis-01:24492] coll:find_available: querying coll component 
>>>>>>> libnbc
>>>>>>> [Metropolis-01:24492] coll:find_available: coll component libnbc is 
>>>>>>> available
>>>>>>> [Metropolis-01:24492] coll:find_available: querying coll component 
>>>>>>> hierarch
>>>>>>> [Metropolis-01:24492] coll:find_available: coll component hierarch is 
>>>>>>> available
>>>>>>> [Metropolis-01:24492] coll:find_available: querying coll component basic
>>>>>>> [Metropolis-01:24492] coll:find_available: coll component basic is 
>>>>>>> available
>>>>>>> [Metropolis-01:24492] coll:find_available: querying coll component inter
>>>>>>> [Metropolis-01:24492] coll:find_available: coll component inter is 
>>>>>>> available
>>>>>>> [Metropolis-01:24492] coll:find_available: querying coll component self
>>>>>>> [Metropolis-01:24492] coll:find_available: coll component self is 
>>>>>>> available
>>>>>>> [Metropolis-01:24491] coll:find_available: coll component inter is 
>>>>>>> available
>>>>>>> [Metropolis-01:24491] coll:find_available: querying coll component self
>>>>>>> [Metropolis-01:24491] coll:find_available: coll component self is 
>>>>>>> available
>>>>>>> [Metropolis-01:24492] hwloc:base:get_nbojbs computed data 0 of 
>>>>>>> NUMANode:0
>>>>>>> [Metropolis-01:24491] hwloc:base:get_nbojbs computed data 0 of 
>>>>>>> NUMANode:0
>>>>>>> [Metropolis-01:24491] coll:base:comm_select: new communicator: 
>>>>>>> MPI_COMM_WORLD (cid 0)
>>>>>>> [Metropolis-01:24491] coll:base:comm_select: Checking all available 
>>>>>>> modules
>>>>>>> [Metropolis-01:24491] coll:tuned:module_tuned query called
>>>>>>> [Metropolis-01:24491] coll:base:comm_select: component available: 
>>>>>>> tuned, priority: 30
>>>>>>> [Metropolis-01:24491] coll:base:comm_select: component available: 
>>>>>>> libnbc, priority: 10
>>>>>>> [Metropolis-01:24491] coll:base:comm_select: component not available: 
>>>>>>> hierarch
>>>>>>> [Metropolis-01:24491] coll:base:comm_select: component available: 
>>>>>>> basic, priority: 10
>>>>>>> [Metropolis-01:24491] coll:base:comm_select: component not available: 
>>>>>>> inter
>>>>>>> [Metropolis-01:24491] coll:base:comm_select: component not available: 
>>>>>>> self
>>>>>>> [Metropolis-01:24491] coll:tuned:module_init called.
>>>>>>> [Metropolis-01:24491] coll:tuned:module_init Tuned is in use
>>>>>>> [Metropolis-01:24491] coll:base:comm_select: new communicator: 
>>>>>>> MPI_COMM_SELF (cid 1)
>>>>>>> [Metropolis-01:24491] coll:base:comm_select: Checking all available 
>>>>>>> modules
>>>>>>> [Metropolis-01:24491] coll:tuned:module_tuned query called
>>>>>>> [Metropolis-01:24491] coll:base:comm_select: component not available: 
>>>>>>> tuned
>>>>>>> [Metropolis-01:24491] coll:base:comm_select: component available: 
>>>>>>> libnbc, priority: 10
>>>>>>> [Metropolis-01:24491] coll:base:comm_select: component not available: 
>>>>>>> hierarch
>>>>>>> [Metropolis-01:24491] coll:base:comm_select: component available: 
>>>>>>> basic, priority: 10
>>>>>>> [Metropolis-01:24491] coll:base:comm_select: component not available: 
>>>>>>> inter
>>>>>>> [Metropolis-01:24491] coll:base:comm_select: component available: self, 
>>>>>>> priority: 75
>>>>>>> [Metropolis-01:24492] coll:base:comm_select: new communicator: 
>>>>>>> MPI_COMM_WORLD (cid 0)
>>>>>>> [Metropolis-01:24492] coll:base:comm_select: Checking all available 
>>>>>>> modules
>>>>>>> [Metropolis-01:24492] coll:tuned:module_tuned query called
>>>>>>> [Metropolis-01:24492] coll:base:comm_select: component available: 
>>>>>>> tuned, priority: 30
>>>>>>> [Metropolis-01:24492] coll:base:comm_select: component available: 
>>>>>>> libnbc, priority: 10
>>>>>>> [Metropolis-01:24492] coll:base:comm_select: component not available: 
>>>>>>> hierarch
>>>>>>> [Metropolis-01:24492] coll:base:comm_select: component available: 
>>>>>>> basic, priority: 10
>>>>>>> [Metropolis-01:24492] coll:base:comm_select: component not available: 
>>>>>>> inter
>>>>>>> [Metropolis-01:24492] coll:base:comm_select: component not available: 
>>>>>>> self
>>>>>>> [Metropolis-01:24492] coll:tuned:module_init called.
>>>>>>> [Metropolis-01:24492] coll:tuned:module_init Tuned is in use
>>>>>>> [Metropolis-01:24492] coll:base:comm_select: new communicator: 
>>>>>>> MPI_COMM_SELF (cid 1)
>>>>>>> [Metropolis-01:24492] coll:base:comm_select: Checking all available 
>>>>>>> modules
>>>>>>> [Metropolis-01:24492] coll:tuned:module_tuned query called
>>>>>>> [Metropolis-01:24492] coll:base:comm_select: component not available: 
>>>>>>> tuned
>>>>>>> [Metropolis-01:24492] coll:base:comm_select: component available: 
>>>>>>> libnbc, priority: 10
>>>>>>> [Metropolis-01:24492] coll:base:comm_select: component not available: 
>>>>>>> hierarch
>>>>>>> [Metropolis-01:24492] coll:base:comm_select: component available: 
>>>>>>> basic, priority: 10
>>>>>>> [Metropolis-01:24492] coll:base:comm_select: component not available: 
>>>>>>> inter
>>>>>>> [Metropolis-01:24492] coll:base:comm_select: component available: self, 
>>>>>>> priority: 75
>>>>>>> [Metropolis-01:24491] coll:tuned:component_close: called
>>>>>>> [Metropolis-01:24491] coll:tuned:component_close: done!
>>>>>>> [Metropolis-01:24492] coll:tuned:component_close: called
>>>>>>> [Metropolis-01:24492] coll:tuned:component_close: done!
>>>>>>> [Metropolis-01:24492] mca: base: close: component tuned closed
>>>>>>> [Metropolis-01:24492] mca: base: close: unloading component tuned
>>>>>>> [Metropolis-01:24492] mca: base: close: component libnbc closed
>>>>>>> [Metropolis-01:24492] mca: base: close: unloading component libnbc
>>>>>>> [Metropolis-01:24492] mca: base: close: unloading component hierarch
>>>>>>> [Metropolis-01:24492] mca: base: close: unloading component basic
>>>>>>> [Metropolis-01:24492] mca: base: close: unloading component inter
>>>>>>> [Metropolis-01:24492] mca: base: close: unloading component self
>>>>>>> [Metropolis-01:24491] mca: base: close: component tuned closed
>>>>>>> [Metropolis-01:24491] mca: base: close: unloading component tuned
>>>>>>> [Metropolis-01:24491] mca: base: close: component libnbc closed
>>>>>>> [Metropolis-01:24491] mca: base: close: unloading component libnbc
>>>>>>> [Metropolis-01:24491] mca: base: close: unloading component hierarch
>>>>>>> [Metropolis-01:24491] mca: base: close: unloading component basic
>>>>>>> [Metropolis-01:24491] mca: base: close: unloading component inter
>>>>>>> [Metropolis-01:24491] mca: base: close: unloading component self
>>>>>>> [jarico@Metropolis-01 examples]$
>>>>>>> 
>>>>>>> 
>>>>>>> SM is not loaded because it detects no other processes on the same 
>>>>>>> machine:
>>>>>>> 
>>>>>>> [Metropolis-01:24491] coll:sm:init_query: no other local procs; 
>>>>>>> disqualifying myself
>>>>>>> 
>>>>>>> The machine is a multicore with 8 cores.
>>>>>>> 
>>>>>>> I need to run the SM component code, and I suppose that, by raising its 
>>>>>>> priority, it will be the component selected once the problem is fixed.
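Raising the priority is done with the MCA parameter already visible in the command line at the top of this log; a minimal invocation (sketch only; `./bmem` is the test binary from this thread, and no `--mca coll` restriction is given so the other components remain available to backfill) might look like:

```shell
# Raise coll sm's priority so it wins selection for the collectives it
# implements, while tuned/basic keep covering the rest. Parameter name
# coll_sm_priority is the one used earlier in this log.
mpiexec --mca coll_sm_priority 99 -n 2 ./bmem
```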
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> El 03/07/2012, a las 21:01, Jeff Squyres escribió:
>>>>>>> 
>>>>>>>> The issue is that the "sm" coll component only implements a few of the 
>>>>>>>> MPI collective operations.  It is usually mixed at run-time with other 
>>>>>>>> coll components to fill out the rest of the MPI collective operations.
>>>>>>>> 
>>>>>>>> So what is happening is that OMPI is determining that it doesn't have 
>>>>>>>> implementations of all the MPI collective operations and aborting.
>>>>>>>> 
>>>>>>>> You shouldn't need to manually select your coll module -- OMPI should 
>>>>>>>> automatically select the right collective module for you.  E.g., if 
>>>>>>>> all procs are local on a single machine and sm has a matching 
>>>>>>>> implementation for that MPI collective operation, it'll be used.
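The mixing Jeff describes can be sketched in a few lines (an illustrative model, not Open MPI's actual implementation; the component names, priorities, and operation lists below are hypothetical):

```python
# Sketch of comm_select-style mixing: each component reports a priority
# (or None if it disqualified itself, as coll sm does here) plus the
# operations it implements. For every collective operation, the
# highest-priority component wins, so a partial component like "sm" is
# backfilled by lower-priority ones such as "tuned" or "basic".

def select_modules(components):
    selected = {}  # operation -> (priority, component name)
    for name, priority, ops in components:
        if priority is None:          # component disqualified itself
            continue
        for op in ops:
            if op not in selected or priority > selected[op][0]:
                selected[op] = (priority, name)
    return {op: name for op, (_, name) in selected.items()}

# Hypothetical component table: (name, priority, operations implemented)
components = [
    ("sm",    99, ["bcast", "barrier"]),           # partial coverage only
    ("tuned", 30, ["bcast", "barrier", "reduce"]),
    ("basic", 10, ["bcast", "barrier", "reduce", "gather"]),
]
print(select_modules(components))
# -> {'bcast': 'sm', 'barrier': 'sm', 'reduce': 'tuned', 'gather': 'basic'}
```

If every component that implements some operation disqualifies itself, no module covers that operation and the framework has to abort, which is the failure mode described above when sm,self are forced on the command line.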
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> On Jul 3, 2012, at 2:48 PM, Juan Antonio Rico Gallego wrote:
>>>>>>>> 
>>>>>>>>> Output is:
>>>>>>>>> 
>>>>>>>>> [Metropolis-01:15355] hwloc:base:get_topology
>>>>>>>>> [Metropolis-01:15355] hwloc:base: no cpus specified - using root 
>>>>>>>>> available cpuset
>>>>>>>>> 
>>>>>>>>> ========================   JOB MAP   ========================
>>>>>>>>> 
>>>>>>>>> Data for node: Metropolis-01       Num procs: 2
>>>>>>>>>  Process OMPI jobid: [59809,1] App: 0 Process rank: 0
>>>>>>>>>  Process OMPI jobid: [59809,1] App: 0 Process rank: 1
>>>>>>>>> 
>>>>>>>>> =============================================================
>>>>>>>>> [Metropolis-01:15356] locality: CL:CU:N:B
>>>>>>>>> [Metropolis-01:15356] hwloc:base: get available cpus
>>>>>>>>> [Metropolis-01:15356] hwloc:base:get_available_cpus first time - 
>>>>>>>>> filtering cpus
>>>>>>>>> [Metropolis-01:15356] hwloc:base: no cpus specified - using root 
>>>>>>>>> available cpuset
>>>>>>>>> [Metropolis-01:15356] hwloc:base:get_available_cpus root object
>>>>>>>>> [Metropolis-01:15357] locality: CL:CU:N:B
>>>>>>>>> [Metropolis-01:15357] hwloc:base: get available cpus
>>>>>>>>> [Metropolis-01:15357] hwloc:base:get_available_cpus first time - 
>>>>>>>>> filtering cpus
>>>>>>>>> [Metropolis-01:15357] hwloc:base: no cpus specified - using root 
>>>>>>>>> available cpuset
>>>>>>>>> [Metropolis-01:15357] hwloc:base:get_available_cpus root object
>>>>>>>>> [Metropolis-01:15356] hwloc:base:get_nbojbs computed data 0 of 
>>>>>>>>> NUMANode:0
>>>>>>>>> [Metropolis-01:15357] hwloc:base:get_nbojbs computed data 0 of 
>>>>>>>>> NUMANode:0
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> Regards,
>>>>>>>>> Juan A. Rico
>>>>>>>>> _______________________________________________
>>>>>>>>> devel mailing list
>>>>>>>>> de...@open-mpi.org
>>>>>>>>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>>>>>>>> 
>>>>>>>> 
>>>>>>>> --
>>>>>>>> Jeff Squyres
>>>>>>>> jsquy...@cisco.com
>>>>>>>> For corporate legal information go to: 
>>>>>>>> http://www.cisco.com/web/about/doing_business/legal/cri/
>>>>>>>> 
>>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>> 
>>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>> 
>>> 
>> 
>> 
>> 
>> 
> 
