CM is not being selected *for* TCP - you specified tcp for the BTLs, but that 
only matters if a BTL-based PML (i.e., ob1) wins the PML selection in the first 
place. You evidently have something in your system that is supported by an MTL, 
and cm over an MTL registers a higher priority than ob1 (30 vs. 20 in your 
output), so it will always be selected ahead of the BTL path unless you force 
ob1.
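
If you want to see which MTL that is, running "ompi_info | grep mtl" on the 
nodes should list the MTL components that were built (psm, mxm, ofi, etc., 
depending on your stack), and adding -mca mtl_base_verbose 100 should show 
which one cm picks at runtime. Pinning the PML, as you already did, is the way 
to force the BTL path instead; roughly, reusing your command line:

mpirun -np 2 -hostfile ~/hostfile -mca pml ob1 -mca btl self,tcp -mca 
btl_base_verbose 100 -mca pml_base_verbose 100 ./mpitest

A couple of more detailed comments are inline below.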


> On Apr 28, 2016, at 8:22 PM, dpchoudh . <dpcho...@gmail.com> wrote:
> 
> Hello Gilles
> 
> You are absolutely right:
> 
> 1. Adding --mca pml_base_verbose 100 does show that it is the cm PML that is 
> being picked by default (even for TCP)
> 2. Adding --mca pml ob1 does cause add_procs() and related BTL friends to be 
> invoked.
> 
> 
> With a command line of
> 
> mpirun -np 2 -hostfile ~/hostfile -mca btl self,tcp  -mca btl_base_verbose 
> 100 -mca pml_base_verbose 100 ./mpitest
> 
> The output shows (among many other lines) the following:
> 
> [smallMPI:49178] select: init returned priority 30
> [smallMPI:49178] select: initializing pml component ob1
> [smallMPI:49178] select: init returned priority 20
> [smallMPI:49178] select: component v not in the include list
> [smallMPI:49178] selected cm best priority 30
> [smallMPI:49178] select: component ob1 not selected / finalized
> [smallMPI:49178] select: component cm selected
> 
> Which shows that the cm PML was selected. Replacing 'tcp' above with 'openib' 
> shows very similar results. (The openib BTL methods are not invoked, either)
> 
> However, I was under the impression that the CM PML can only handle MTLs (and 
> ob1 can only handle BTLs). So why is cm being selected for TCP?
> 
> Thank you
> Durga
> 
> 
> 
> The surgeon general advises you to eat right, exercise regularly and quit 
> ageing.
> 
> On Thu, Apr 28, 2016 at 2:34 AM, Gilles Gouaillardet <gil...@rist.or.jp> wrote:
> the add_procs subroutine of the btl should be called.
> 
> /* i added a printf in mca_btl_tcp_add_procs and it *is* invoked */
> 
> can you try again with --mca pml ob1 --mca pml_base_verbose 100 ?
> 
> maybe the add_procs subroutine is not invoked because openmpi uses cm instead 
> of ob1
> 
> 
> Cheers,
> 
> 
> Gilles
> 
> On 4/28/2016 3:07 PM, dpchoudh . wrote:
>> Hello all
>> 
>> I am struggling with this issue for last few days and thought it would be 
>> prudent to ask for help from people who have way more experience than I do.
>> 
>> There are two questions, interrelated in my mind, but may not be so in 
>> reality. Question 2 is the issue I am struggling with, and question 1 sort 
>> of leads to it.
>> 
>> 1. I see that in both the openib and tcp BTLs (the two kinds of hardware I 
>> have access to) a modex send happens, but a matching modex receive never 
>> happens. Is it because of some kind of optimization? (In my case, both IP 
>> NICs are in the same IP subnet and both IB NICs are in the same IB subnet.) 
>> Or am I not understanding something? How do the processes figure out their 
>> peer information without a modex receive?
>> 
>> The place in the code where the modex receive is called is btl_add_procs(). 
>> However, it looks like this method is never called in either of the above 
>> BTLs. Is that expected?
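
For what it's worth, the modex receive for a BTL normally lives inside its 
add_procs(): the component publishes its endpoint blob once at init time with 
OPAL_MODEX_SEND, and each process pulls its peers' blobs with OPAL_MODEX_RECV 
when add_procs() runs. So if add_procs() is never reached (because cm/an MTL 
was selected instead of ob1), you will see the send but never the receive. A 
rough sketch of the pair follows; I am writing it from memory, so treat the 
exact macro arguments as an assumption and check the modex macros in your 
tree (lf_addr_t, my_addr and the lf component names are placeholders):

    /* at component init time: publish this node's endpoint info */
    int rc;
    lf_addr_t my_addr;                  /* hypothetical address blob,
                                           filled in by the driver */
    OPAL_MODEX_SEND(rc, OPAL_PMIX_GLOBAL,
                    &mca_btl_lf_component.super.btl_version, /* this component */
                    &my_addr, sizeof(my_addr));

    /* later, in mca_btl_lf_add_procs(): pull what peer i published */
    lf_addr_t *peer_addr = NULL;
    size_t size = 0;
    OPAL_MODEX_RECV(rc, &mca_btl_lf_component.super.btl_version,
                    &procs[i]->proc_name, (void **)&peer_addr, &size);
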
>> 
>> 2. The real question is this:
>> I am writing a BTL for a proprietary RDMA NIC (named 'lf' in the code) whose 
>> protocol has no routing capability, and hence no concept of subnets. An HCA 
>> simply needs to be plugged into the switch and it can see the whole network. 
>> There is, however, a VLAN-like partitioning scheme (similar to IB partitions).
>> Given this (and, as a first cut, every node is in the same partition, so even 
>> that complexity is eliminated), there is not much use for a modex exchange, 
>> but I added one anyway, carrying just the partition key.
>> 
>> What I see is that the component open, register and init all succeed, but 
>> the r2 BML still does not choose this network, and so OMPI aborts because of 
>> a lack of full reachability.
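
One thing to check on the r2 side: r2 hands each BTL's add_procs() a 
reachability bitmap, and it will only use that BTL for a given peer if 
add_procs() sets the corresponding bit. If your lf add_procs() returns success 
but never calls opal_bitmap_set_bit() (or bails out before it, e.g. because 
the modex receive fails), you get exactly the kind of abort you show below, 
with "BTLs attempted: self" and nothing else. A rough sketch of the expected 
shape, with the argument names from memory of the current BTL interface and 
create_lf_endpoint() as a made-up placeholder:

    static int mca_btl_lf_add_procs(struct mca_btl_base_module_t *btl,
                                    size_t nprocs,
                                    struct opal_proc_t **procs,
                                    struct mca_btl_base_endpoint_t **peers,
                                    opal_bitmap_t *reachable)
    {
        for (size_t i = 0; i < nprocs; ++i) {
            /* local peers are handled by self/sm */
            if (OPAL_PROC_ON_LOCAL_NODE(procs[i]->proc_flags)) {
                continue;
            }
            /* modex receive + partition key check + endpoint creation
             * go here; create_lf_endpoint() is a placeholder */
            peers[i] = create_lf_endpoint(procs[i]);
            if (NULL == peers[i]) {
                continue;          /* peer not reachable over lf */
            }
            /* this is what tells the r2 bml that peer i is reachable */
            opal_bitmap_set_bit(reachable, i);
        }
        return OPAL_SUCCESS;
    }
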
>> 
>> This is my command line:
>> sudo /usr/local/bin/mpirun --allow-run-as-root -hostfile ~/hostfile -np 2 
>> -mca btl self,lf -mca btl_base_verbose 100 -mca bml_base_verbose 100 
>> ./mpitest
>> 
>> ('mpitest' is a trivial 'hello world' program plus ONE MPI_Send()/MPI_Recv() 
>> to test in-band communication. The sudo is required because currently the 
>> driver requires root permission; I was told that this will be fixed. The 
>> hostfile has 2 hosts, named b-2 and b-3, with back-to-back connection on 
>> this 'lf' HCA)
>> 
>> The output of this command is as follows; I have added my comments to 
>> explain it a bit.
>> 
>> <Output from OMPI logging mechanism>
>> [b-2:21062] mca: base: components_register: registering framework bml 
>> components
>> [b-2:21062] mca: base: components_register: found loaded component r2
>> [b-2:21062] mca: base: components_register: component r2 register function 
>> successful
>> [b-2:21062] mca: base: components_open: opening bml components
>> [b-2:21062] mca: base: components_open: found loaded component r2
>> [b-2:21062] mca: base: components_open: component r2 open function successful
>> [b-2:21062] mca: base: components_register: registering framework btl 
>> components
>> [b-2:21062] mca: base: components_register: found loaded component self
>> [b-2:21062] mca: base: components_register: component self register function 
>> successful
>> [b-2:21062] mca: base: components_register: found loaded component lf
>> [b-2:21062] mca: base: components_register: component lf register function 
>> successful
>> [b-2:21062] mca: base: components_open: opening btl components
>> [b-2:21062] mca: base: components_open: found loaded component self
>> [b-2:21062] mca: base: components_open: component self open function 
>> successful
>> [b-2:21062] mca: base: components_open: found loaded component lf
>> 
>> <Debugging output from the HCA driver>
>> lf_group_lib.c:442: _lf_open: _lf_open("MPI_0",0x842,0x1b6,4096,0)
>> 
>> <Output from OMPI logging mechanism, continued>
>> [b-2:21062] mca: base: components_open: component lf open function successful
>> [b-2:21062] select: initializing btl component self
>> [b-2:21062] select: init of component self returned success
>> [b-2:21062] select: initializing btl component lf
>> 
>> <Debugging output from the HCA driver>
>> Created group on b-2
>> 
>> <Output from OMPI logging mechanism, continued>
>> [b-2:21062] select: init of component lf returned success
>> [b-3:07672] mca: base: components_register: registering framework bml 
>> components
>> [b-3:07672] mca: base: components_register: found loaded component r2
>> [b-3:07672] mca: base: components_register: component r2 register function 
>> successful
>> [b-3:07672] mca: base: components_open: opening bml components
>> [b-3:07672] mca: base: components_open: found loaded component r2
>> [b-3:07672] mca: base: components_open: component r2 open function successful
>> [b-3:07672] mca: base: components_register: registering framework btl 
>> components
>> [b-3:07672] mca: base: components_register: found loaded component self
>> [b-3:07672] mca: base: components_register: component self register function 
>> successful
>> [b-3:07672] mca: base: components_register: found loaded component lf
>> [b-3:07672] mca: base: components_register: component lf register function 
>> successful
>> [b-3:07672] mca: base: components_open: opening btl components
>> [b-3:07672] mca: base: components_open: found loaded component self
>> [b-3:07672] mca: base: components_open: component self open function 
>> successful
>> [b-3:07672] mca: base: components_open: found loaded component lf
>> [b-3:07672] mca: base: components_open: component lf open function successful
>> [b-3:07672] select: initializing btl component self
>> [b-3:07672] select: init of component self returned success
>> [b-3:07672] select: initializing btl component lf
>> 
>> <Debugging output from the HCA driver>
>> lf_group_lib.c:442: _lf_open: _lf_open("MPI_0",0x842,0x1b6,4096,0)
>> Created group on b-3
>> 
>> <Output from OMPI logging mechanism, continued>
>> [b-3:07672] select: init of component lf returned success
>> [b-2:21062] mca: bml: Using self btl for send to [[6866,1],0] on node b-2
>> [b-3:07672] mca: bml: Using self btl for send to [[6866,1],1] on node b-3
>> 
>> <Output from the 'mpitest' MPI program: out-of-band I/O>
>> Hello from b-2
>> The world has 2 nodes
>> My rank is 0
>> Hello from b-3
>> 
>> <Output from OMPI>
>> --------------------------------------------------------------------------
>> At least one pair of MPI processes are unable to reach each other for
>> MPI communications.  This means that no Open MPI device has indicated
>> that it can be used to communicate between these processes.  This is
>> an error; Open MPI requires that all MPI processes be able to reach
>> each other.  This error can sometimes be the result of forgetting to
>> specify the "self" BTL.
>> 
>>   Process 1 ([[6866,1],0]) is on host: b-2
>>   Process 2 ([[6866,1],1]) is on host: 10.4.70.12
>>   BTLs attempted: self
>> 
>> Your MPI job is now going to abort; sorry.
>> --------------------------------------------------------------------------
>> 
>> <Output from the 'mpitest' MPI program: out-of-band I/O, continued>
>> The world has 2 nodes
>> My rank is 1
>> 
>> <Output from OMPI logging mechanism, continued>
>> [b-2:21062] *** An error occurred in MPI_Send
>> [b-2:21062] *** reported by process [140385751007233,21474836480]
>> [b-2:21062] *** on communicator MPI_COMM_WORLD
>> [b-2:21062] *** MPI_ERR_INTERN: internal error
>> [b-2:21062] *** MPI_ERRORS_ARE_FATAL (processes in this communicator will 
>> now abort,
>> [b-2:21062] ***    and potentially your MPI job)
>> [durga@b-2 ~]$
>> 
>> As you can see, the lf network is not being chosen for communication. 
>> Without a modex exchange, how can that happen? Or, in a nutshell, what do I 
>> need to do?
>> 
>> Thanks a lot in advance
>> Durga
>> 
>> 
>> 1% of the executables have 99% of CPU privilege!
>> Userspace code! Unite!! Occupy the kernel!!!