My basic understanding is that ob1 works with BTLs, and cm works with MTLs
(please someone correct me if I am wrong).
Another way to put this: cm cannot use the tcp BTL.

So I can only guess that one MTL (PSM?) is available, and hence cm is
preferred over ob1.

What happens if you run
mpirun --mca mtl ^psm ...
Is cm still selected over ob1?

Note that PSM does not disqualify itself when there is no link; this is
now being investigated at Intel.

Cheers,

Gilles

On Friday, April 29, 2016, dpchoudh . <dpcho...@gmail.com> wrote:

> Hello Gilles
>
> You are absolutely right:
>
> 1. Adding --mca pml_base_verbose 100 does show that it is the cm PML that
> is being picked by default (even for TCP)
> 2. Adding --mca pml ob1 does cause add_procs() and related BTL friends to
> be invoked.
>
>
> With a command line of
>
> mpirun -np 2 -hostfile ~/hostfile -mca btl self,tcp  -mca btl_base_verbose
> 100 -mca pml_base_verbose 100 ./mpitest
>
> The output shows (among many other lines) the following:
>
> [smallMPI:49178] select: init returned priority 30
> [smallMPI:49178] select: initializing pml component ob1
> [smallMPI:49178] select: init returned priority 20
> [smallMPI:49178] select: component v not in the include list
> [smallMPI:49178] selected cm best priority 30
>
> [smallMPI:49178] select: component ob1 not selected / finalized
> [smallMPI:49178] select: component cm selected
>
> This shows that the cm PML was selected. Replacing 'tcp' above with
> 'openib' shows very similar results (the openib BTL methods are not
> invoked either).
>
> However, I was under the impression that the CM PML can only handle MTLs
> (and ob1 can only handle BTLs). So why is cm being selected for TCP?
>
> Thank you
> Durga
>
>
>
> The surgeon general advises you to eat right, exercise regularly and quit
> ageing.
>
> On Thu, Apr 28, 2016 at 2:34 AM, Gilles Gouaillardet <gil...@rist.or.jp> wrote:
>
>> The add_procs subroutine of the BTL should be called.
>>
>> /* I added a printf in mca_btl_tcp_add_procs and it *is* invoked */
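>>
>> If you want to check the same thing in your lf BTL, the kind of change I
>> mean is as simple as the sketch below (the signature is from the current
>> master BTL interface, the header paths are from memory, and the lf names
>> are only placeholders for whatever your component really uses, so treat
>> this as a rough example):
>>
>> #include <stdio.h>
>> #include "opal/constants.h"           /* OPAL_SUCCESS */
>> #include "opal/class/opal_bitmap.h"   /* opal_bitmap_t */
>> #include "opal/mca/btl/btl.h"         /* mca_btl_base_module_t */
>>
>> static int mca_btl_lf_add_procs(struct mca_btl_base_module_t *btl,
>>                                 size_t nprocs,
>>                                 struct opal_proc_t **procs,
>>                                 struct mca_btl_base_endpoint_t **peers,
>>                                 opal_bitmap_t *reachable)
>> {
>>     /* temporary debug output, only to confirm the BML really calls us */
>>     printf("lf add_procs invoked for %lu procs\n", (unsigned long) nprocs);
>>
>>     /* ... the real endpoint setup goes here ... */
>>     return OPAL_SUCCESS;
>> }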
>>
>> can you try again with --mca pml ob1 --mca pml_base_verbose 100 ?
>>
>> Maybe the add_procs subroutine is not invoked because Open MPI uses cm
>> instead of ob1.
>>
>>
>> Cheers,
>>
>>
>> Gilles
>>
>> On 4/28/2016 3:07 PM, dpchoudh . wrote:
>>
>> Hello all
>>
>> I have been struggling with this issue for the last few days and thought it
>> would be prudent to ask for help from people who have far more experience
>> than I do.
>>
>> There are two questions, interrelated in my mind, though perhaps not in
>> reality. Question 2 is the issue I am struggling with, and question 1 sort
>> of leads to it.
>>
>> 1. I see that in both the openib and tcp BTLs (the two kinds of hardware I
>> have access to) a modex send happens, but a matching modex receive never
>> happens. Is this because of some kind of optimization? (In my case, both IP
>> NICs are in the same IP subnet and both IB NICs are in the same IB subnet.)
>> Or am I not understanding something? How do the processes figure out their
>> peer information without a modex receive?
>>
>> The place in the code where the modex receive is called is btl_add_procs().
>> However, it looks like this method is never called in either of the above
>> BTLs. Is that expected?
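>>
>> (For reference, my reading of the tcp BTL is that the receive side looks
>> roughly like the fragment below, inside its add_procs / proc-create path.
>> I am writing the OPAL_MODEX_RECV arguments from memory, so the exact
>> signature should be checked against opal/mca/pmix/pmix.h; 'proc' and
>> 'peer_addrs' are just local names in my sketch.)
>>
>> int rc;
>> uint8_t *peer_addrs = NULL;
>> size_t size = 0;
>>
>> /* pull the blob the peer published with its modex send */
>> OPAL_MODEX_RECV(rc, &mca_btl_tcp_component.super.btl_version,
>>                 &proc->proc_name, (uint8_t **) &peer_addrs, &size);
>> if (OPAL_SUCCESS != rc) {
>>     return rc;   /* this peer published nothing for the tcp BTL */
>> }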
>>
>> 2. The real question is this:
>> I am writing a BTL for a proprietary RDMA NIC (named 'lf' in the code)
>> that has no routing capability in its protocol, and hence no concept of
>> subnets. An HCA simply needs to be plugged into the switch and it can see
>> the whole network. However, there is a VLAN-like partitioning mechanism
>> (similar to IB partitions).
>> Given this (and as a first cut, every node is in the same partition, so
>> even this complexity is eliminated), there is not much use for a modex
>> exchange, but I added one anyway just with the partition key.
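>>
>> (Concretely, the send side I added looks roughly like the sketch below,
>> modelled on what I think the tcp BTL does in its component exchange path.
>> The mca_btl_lf_component fields and the function name are mine, and the
>> OPAL_MODEX_SEND arguments and scope constant are written from memory, so
>> please read this as a sketch, not as verified code.)
>>
>> #include "opal/mca/pmix/pmix.h"   /* OPAL_MODEX_SEND, OPAL_PMIX_GLOBAL */
>>
>> /* publish the partition key so peers can verify they are in the same
>>  * partition; called once from the component init path */
>> static int mca_btl_lf_component_exchange(void)
>> {
>>     int rc;
>>     uint32_t pkey = mca_btl_lf_component.partition_key;
>>
>>     OPAL_MODEX_SEND(rc, OPAL_PMIX_GLOBAL,
>>                     &mca_btl_lf_component.super.btl_version,
>>                     &pkey, sizeof(pkey));
>>     return rc;
>> }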
>>
>> What I see is that the component register, open, and init are all
>> successful, but the r2 BML still does not choose this network, and thus
>> OMPI aborts because of lack of full reachability.
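>>
>> (My understanding -- and please correct me if this is wrong -- is that r2
>> will only keep a BTL for a given peer if that BTL's add_procs marks the
>> peer in the 'reachable' bitmap, something like the fragment below.
>> opal_bitmap_set_bit is the real OPAL call; the two helpers are
>> hypothetical placeholders for my own lf code.)
>>
>> /* inside mca_btl_lf_add_procs(): if no bit is ever set here, the r2 BML
>>  * concludes the peer is unreachable over this BTL */
>> for (size_t i = 0; i < nprocs; ++i) {
>>     if (lf_pkey_matches(procs[i])) {              /* hypothetical helper */
>>         peers[i] = lf_create_endpoint(procs[i]);  /* hypothetical helper */
>>         opal_bitmap_set_bit(reachable, i);
>>     }
>> }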
>>
>> This is my command line:
>> sudo /usr/local/bin/mpirun --allow-run-as-root -hostfile ~/hostfile -np 2
>> -mca btl self,lf -mca btl_base_verbose 100 -mca bml_base_verbose 100
>> ./mpitest
>>
>> ('mpitest' is a trivial 'hello world' program plus ONE
>> MPI_Send()/MPI_Recv() to test in-band communication. The sudo is required
>> because the driver currently requires root permission; I was told that this
>> will be fixed. The hostfile has 2 hosts, named b-2 and b-3, with a
>> back-to-back connection on this 'lf' HCA.)
>>
>> The output of this command is as follows; I have added my comments to
>> explain it a bit.
>>
>> <Output from OMPI logging mechanism>
>> [b-2:21062] mca: base: components_register: registering framework bml
>> components
>> [b-2:21062] mca: base: components_register: found loaded component r2
>> [b-2:21062] mca: base: components_register: component r2 register
>> function successful
>> [b-2:21062] mca: base: components_open: opening bml components
>> [b-2:21062] mca: base: components_open: found loaded component r2
>> [b-2:21062] mca: base: components_open: component r2 open function
>> successful
>> [b-2:21062] mca: base: components_register: registering framework btl
>> components
>> [b-2:21062] mca: base: components_register: found loaded component self
>> [b-2:21062] mca: base: components_register: component self register
>> function successful
>> [b-2:21062] mca: base: components_register: found loaded component lf
>> [b-2:21062] mca: base: components_register: component lf register
>> function successful
>> [b-2:21062] mca: base: components_open: opening btl components
>> [b-2:21062] mca: base: components_open: found loaded component self
>> [b-2:21062] mca: base: components_open: component self open function
>> successful
>> [b-2:21062] mca: base: components_open: found loaded component lf
>>
>> <Debugging output from the HCA driver>
>> lf_group_lib.c:442: _lf_open: _lf_open("MPI_0",0x842,0x1b6,4096,0)
>>
>> <Output from OMPI logging mechanism, continued>
>> [b-2:21062] mca: base: components_open: component lf open function
>> successful
>> [b-2:21062] select: initializing btl component self
>> [b-2:21062] select: init of component self returned success
>> [b-2:21062] select: initializing btl component lf
>>
>> <Debugging output from the HCA driver>
>> Created group on b-2
>>
>> <Output from OMPI logging mechanism, continued>
>> [b-2:21062] select: init of component lf returned success
>> [b-3:07672] mca: base: components_register: registering framework bml
>> components
>> [b-3:07672] mca: base: components_register: found loaded component r2
>> [b-3:07672] mca: base: components_register: component r2 register
>> function successful
>> [b-3:07672] mca: base: components_open: opening bml components
>> [b-3:07672] mca: base: components_open: found loaded component r2
>> [b-3:07672] mca: base: components_open: component r2 open function
>> successful
>> [b-3:07672] mca: base: components_register: registering framework btl
>> components
>> [b-3:07672] mca: base: components_register: found loaded component self
>> [b-3:07672] mca: base: components_register: component self register
>> function successful
>> [b-3:07672] mca: base: components_register: found loaded component lf
>> [b-3:07672] mca: base: components_register: component lf register
>> function successful
>> [b-3:07672] mca: base: components_open: opening btl components
>> [b-3:07672] mca: base: components_open: found loaded component self
>> [b-3:07672] mca: base: components_open: component self open function
>> successful
>> [b-3:07672] mca: base: components_open: found loaded component lf
>> [b-3:07672] mca: base: components_open: component lf open function
>> successful
>> [b-3:07672] select: initializing btl component self
>> [b-3:07672] select: init of component self returned success
>> [b-3:07672] select: initializing btl component lf
>>
>> <Debugging output from the HCA driver>
>> lf_group_lib.c:442: _lf_open: _lf_open("MPI_0",0x842,0x1b6,4096,0)
>> Created group on b-3
>>
>> <Output from OMPI logging mechanism, continued>
>> [b-3:07672] select: init of component lf returned success
>> [b-2:21062] mca: bml: Using self btl for send to [[6866,1],0] on node b-2
>> [b-3:07672] mca: bml: Using self btl for send to [[6866,1],1] on node b-3
>>
>> <Output from the 'mpitest' MPI program: out-of-band I/O>
>> Hello from b-2
>> The world has 2 nodes
>> My rank is 0
>> Hello from b-3
>>
>> <Output from OMPI>
>> --------------------------------------------------------------------------
>> At least one pair of MPI processes are unable to reach each other for
>> MPI communications.  This means that no Open MPI device has indicated
>> that it can be used to communicate between these processes.  This is
>> an error; Open MPI requires that all MPI processes be able to reach
>> each other.  This error can sometimes be the result of forgetting to
>> specify the "self" BTL.
>>
>>   Process 1 ([[6866,1],0]) is on host: b-2
>>   Process 2 ([[6866,1],1]) is on host: 10.4.70.12
>>   BTLs attempted: self
>>
>> Your MPI job is now going to abort; sorry.
>> --------------------------------------------------------------------------
>>
>> <Output from the 'mpitest' MPI program: out-of-band I/O, continued>
>> The world has 2 nodes
>> My rank is 1
>>
>> <Output from OMPI logging mechanism, continued>
>> [b-2:21062] *** An error occurred in MPI_Send
>> [b-2:21062] *** reported by process [140385751007233,21474836480]
>> [b-2:21062] *** on communicator MPI_COMM_WORLD
>> [b-2:21062] *** MPI_ERR_INTERN: internal error
>> [b-2:21062] *** MPI_ERRORS_ARE_FATAL (processes in this communicator will
>> now abort,
>> [b-2:21062] ***    and potentially your MPI job)
>> [durga@b-2 ~]$
>>
>> As you can see, the lf network is not being chosen for communication.
>> Without a modex exchange, how can that happen? Or, in a nutshell, what do I
>> need to do?
>>
>> Thanks a lot in advance
>> Durga
>>
>>
>> 1% of the executables have 99% of CPU privilege!
>> Userspace code! Unite!! Occupy the kernel!!!
>>
>>
>> _______________________________________________
>> devel mailing list
>> de...@open-mpi.org
>> Subscription: https://www.open-mpi.org/mailman/listinfo.cgi/devel
>> Link to this post: 
>> http://www.open-mpi.org/community/lists/devel/2016/04/18827.php
>>
>>
>>
>> _______________________________________________
>> devel mailing list
>> de...@open-mpi.org
>> Subscription: https://www.open-mpi.org/mailman/listinfo.cgi/devel
>> Link to this post:
>> http://www.open-mpi.org/community/lists/devel/2016/04/18828.php
>>
>
>
