My basic understanding is that ob1 works with BTLs, and cm works with MTLs (please someone correct me if I am wrong). Another way to put this is that cm cannot use the tcp BTL.

So I can only guess one MTL (PSM?) is available, and so cm is preferred over ob1. If you mpirun --mca mtl ^psm ..., is cm still selected over ob1?

Note that PSM does not disqualify itself if there is no link; this is now being investigated at Intel.

Cheers,

Gilles
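For reference, the two experiments can be spelled out with the flags already used elsewhere in this thread (hostfile, test program and verbose levels are taken from the messages below); the verbose PML output should show whether excluding psm alone is enough for ob1 to win the selection, or whether it has to be forced:

mpirun -np 2 -hostfile ~/hostfile --mca mtl ^psm --mca btl self,tcp \
       --mca pml_base_verbose 100 ./mpitest

mpirun -np 2 -hostfile ~/hostfile --mca pml ob1 --mca btl self,tcp \
       --mca pml_base_verbose 100 --mca btl_base_verbose 100 ./mpitest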
On Friday, April 29, 2016, dpchoudh . <dpcho...@gmail.com> wrote:
> Hello Gilles
>
> You are absolutely right:
>
> 1. Adding --mca pml_base_verbose 100 does show that it is the cm PML that
> is being picked by default (even for TCP).
> 2. Adding --mca pml ob1 does cause add_procs() and related BTL friends to
> be invoked.
>
> With a command line of
>
> mpirun -np 2 -hostfile ~/hostfile -mca btl self,tcp -mca btl_base_verbose 100 -mca pml_base_verbose 100 ./mpitest
>
> the output shows (among many other lines) the following:
>
> [smallMPI:49178] select: init returned priority 30
> [smallMPI:49178] select: initializing pml component ob1
> [smallMPI:49178] select: init returned priority 20
> [smallMPI:49178] select: component v not in the include list
> [smallMPI:49178] selected cm best priority 30
> [smallMPI:49178] select: component ob1 not selected / finalized
> [smallMPI:49178] select: component cm selected
>
> which shows that the cm PML was selected. Replacing 'tcp' above with
> 'openib' shows very similar results. (The openib BTL methods are not
> invoked, either.)
>
> However, I was under the impression that the cm PML can only handle MTLs
> (and ob1 can only handle BTLs). So why is cm being selected for TCP?
>
> Thank you
> Durga
>
> The surgeon general advises you to eat right, exercise regularly and quit
> ageing.
>
> On Thu, Apr 28, 2016 at 2:34 AM, Gilles Gouaillardet <gil...@rist.or.jp> wrote:
>
>> The add_procs subroutine of the BTL should be called.
>>
>> /* I added a printf in mca_btl_tcp_add_procs and it *is* invoked */
>>
>> Can you try again with --mca pml ob1 --mca pml_base_verbose 100 ?
>>
>> Maybe the add_procs subroutine is not invoked because Open MPI uses cm
>> instead of ob1.
>>
>> Cheers,
>>
>> Gilles
>>
>> On 4/28/2016 3:07 PM, dpchoudh . wrote:
>>
>> Hello all
>>
>> I have been struggling with this issue for the last few days and thought
>> it would be prudent to ask for help from people who have way more
>> experience than I do.
>>
>> There are two questions, interrelated in my mind, but perhaps not so in
>> reality. Question 2 is the issue I am struggling with, and question 1
>> sort of leads to it.
>>
>> 1. I see that in both the openib and tcp BTLs (the two kinds of hardware
>> I have access to) a modex send happens, but a matching modex receive
>> never happens. Is this because of some kind of optimization? (In my
>> case, both IP NICs are in the same IP subnet and both IB NICs are in the
>> same IB subnet.) Or am I not understanding something? How do the
>> processes figure out their peer information without a modex receive?
>>
>> The place in the code where the modex receive is called is in
>> btl_add_procs(). However, it looks like in both of the above BTLs this
>> method is never called. Is that expected?
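A rough sketch of how the two halves of that exchange usually fit together may help frame the question. Every name below (lf_addr_t, lf_modex_send, mca_btl_lf_component_exchange) is made up for illustration; the real call is ompi_modex_send() in older Open MPI trees or the OPAL_MODEX_SEND macro in newer ones, so the send call here is only a placeholder. The point is that the send half runs once at component initialization, while the matching receive normally lives in add_procs(); if add_procs() is never invoked, the receive half simply never runs.

#include <stdint.h>
#include <stddef.h>

/* Hypothetical endpoint-address blob for the 'lf' BTL: whatever a peer
 * needs in order to reach this process. */
typedef struct {
    uint32_t hca_id;         /* local HCA identifier on the fabric            */
    uint16_t partition_key;  /* VLAN-like partition key (see the post below)  */
} lf_addr_t;

/* Placeholder for the real modex-send interface (ompi_modex_send() in
 * older Open MPI trees, the OPAL_MODEX_SEND macro in newer ones). */
extern int lf_modex_send(const void *blob, size_t size);

/* Called once from the component's init/open path: publish this rank's
 * address so that peers can retrieve it later with the matching modex
 * receive, which for a BTL normally happens inside add_procs(). */
static int mca_btl_lf_component_exchange(uint32_t hca_id, uint16_t pkey)
{
    lf_addr_t addr = { .hca_id = hca_id, .partition_key = pkey };
    return lf_modex_send(&addr, sizeof(addr));
}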
>> 2. The real question is this:
>> I am writing a BTL for a proprietary RDMA NIC (named 'lf' in the code)
>> that has no routing capability in its protocol, and hence no concept of
>> subnets. An HCA simply needs to be plugged into the switch and it can see
>> the whole network. However, there is a VLAN-like partitioning scheme
>> (similar to IB partitions).
>>
>> Given this (and as a first cut, every node is in the same partition, so
>> even this complexity is eliminated), there is not much use for a modex
>> exchange, but I added one anyway, just with the partition key.
>>
>> What I see is that the component register, open and init are all
>> successful, but the r2 BML still does not choose this network, and thus
>> OMPI aborts because of lack of full reachability.
>>
>> This is my command line:
>>
>> sudo /usr/local/bin/mpirun --allow-run-as-root -hostfile ~/hostfile -np 2 -mca btl self,lf -mca btl_base_verbose 100 -mca bml_base_verbose 100 ./mpitest
>>
>> ('mpitest' is a trivial 'hello world' program plus ONE
>> MPI_Send()/MPI_Recv() pair to test in-band communication. The sudo is
>> required because currently the driver requires root permission; I was
>> told that this will be fixed. The hostfile has 2 hosts, named b-2 and
>> b-3, with a back-to-back connection on this 'lf' HCA.)
>>
>> The output of this command is as follows; I have added my comments to
>> explain it a bit.
>>
>> <Output from OMPI logging mechanism>
>> [b-2:21062] mca: base: components_register: registering framework bml components
>> [b-2:21062] mca: base: components_register: found loaded component r2
>> [b-2:21062] mca: base: components_register: component r2 register function successful
>> [b-2:21062] mca: base: components_open: opening bml components
>> [b-2:21062] mca: base: components_open: found loaded component r2
>> [b-2:21062] mca: base: components_open: component r2 open function successful
>> [b-2:21062] mca: base: components_register: registering framework btl components
>> [b-2:21062] mca: base: components_register: found loaded component self
>> [b-2:21062] mca: base: components_register: component self register function successful
>> [b-2:21062] mca: base: components_register: found loaded component lf
>> [b-2:21062] mca: base: components_register: component lf register function successful
>> [b-2:21062] mca: base: components_open: opening btl components
>> [b-2:21062] mca: base: components_open: found loaded component self
>> [b-2:21062] mca: base: components_open: component self open function successful
>> [b-2:21062] mca: base: components_open: found loaded component lf
>>
>> <Debugging output from the HCA driver>
>> lf_group_lib.c:442: _lf_open: _lf_open("MPI_0",0x842,0x1b6,4096,0)
>>
>> <Output from OMPI logging mechanism, continued>
>> [b-2:21062] mca: base: components_open: component lf open function successful
>> [b-2:21062] select: initializing btl component self
>> [b-2:21062] select: init of component self returned success
>> [b-2:21062] select: initializing btl component lf
>>
>> <Debugging output from the HCA driver>
>> Created group on b-2
>>
>> <Output from OMPI logging mechanism, continued>
>> [b-2:21062] select: init of component lf returned success
>> [b-3:07672] mca: base: components_register: registering framework bml components
>> [b-3:07672] mca: base: components_register: found loaded component r2
>> [b-3:07672] mca: base: components_register: component r2 register function successful
>> [b-3:07672] mca: base: components_open: opening bml components
>> [b-3:07672] mca: base: components_open: found loaded component r2
>> [b-3:07672] mca: base: components_open: component r2 open function successful
>> [b-3:07672] mca: base: components_register: registering framework btl components
>> [b-3:07672] mca: base: components_register: found loaded component self
>> [b-3:07672] mca: base: components_register: component self register function successful
>> [b-3:07672] mca: base: components_register: found loaded component lf
>> [b-3:07672] mca: base: components_register: component lf register function successful
>> [b-3:07672] mca: base: components_open: opening btl components
>> [b-3:07672] mca: base: components_open: found loaded component self
>> [b-3:07672] mca: base: components_open: component self open function successful
>> [b-3:07672] mca: base: components_open: found loaded component lf
>> [b-3:07672] mca: base: components_open: component lf open function successful
>> [b-3:07672] select: initializing btl component self
>> [b-3:07672] select: init of component self returned success
>> [b-3:07672] select: initializing btl component lf
>>
>> <Debugging output from the HCA driver>
>> lf_group_lib.c:442: _lf_open: _lf_open("MPI_0",0x842,0x1b6,4096,0)
>> Created group on b-3
>>
>> <Output from OMPI logging mechanism, continued>
>> [b-3:07672] select: init of component lf returned success
>> [b-2:21062] mca: bml: Using self btl for send to [[6866,1],0] on node b-2
>> [b-3:07672] mca: bml: Using self btl for send to [[6866,1],1] on node b-3
>>
>> <Output from the 'mpitest' MPI program: out-of-band I/O>
>> Hello from b-2
>> The world has 2 nodes
>> My rank is 0
>> Hello from b-3
>>
>> <Output from OMPI>
>> --------------------------------------------------------------------------
>> At least one pair of MPI processes are unable to reach each other for
>> MPI communications. This means that no Open MPI device has indicated
>> that it can be used to communicate between these processes. This is
>> an error; Open MPI requires that all MPI processes be able to reach
>> each other. This error can sometimes be the result of forgetting to
>> specify the "self" BTL.
>>
>> Process 1 ([[6866,1],0]) is on host: b-2
>> Process 2 ([[6866,1],1]) is on host: 10.4.70.12
>> BTLs attempted: self
>>
>> Your MPI job is now going to abort; sorry.
>> --------------------------------------------------------------------------
>>
>> <Output from the 'mpitest' MPI program: out-of-band I/O, continued>
>> The world has 2 nodes
>> My rank is 1
>>
>> <Output from OMPI logging mechanism, continued>
>> [b-2:21062] *** An error occurred in MPI_Send
>> [b-2:21062] *** reported by process [140385751007233,21474836480]
>> [b-2:21062] *** on communicator MPI_COMM_WORLD
>> [b-2:21062] *** MPI_ERR_INTERN: internal error
>> [b-2:21062] *** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
>> [b-2:21062] *** and potentially your MPI job)
>> [durga@b-2 ~]$
>>
>> As you can see, the lf network is not being chosen for communication.
>> Without a modex exchange, how can that happen? Or, in a nutshell, what do
>> I need to do?
>>
>> Thanks a lot in advance
>> Durga
>>
>> 1% of the executables have 99% of CPU privilege!
>> Userspace code! Unite!! Occupy the kernel!!!
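To make the reachability point concrete: as far as I understand it, the r2 BML only puts a BTL on a peer's send/RDMA list when that BTL's add_procs() marks the peer in the reachability bitmap it is handed, and add_procs() is only driven by the ob1 PML. Below is a schematic receive-side counterpart to the earlier sketch. Every lf_* name is again hypothetical, the modex-receive and bitmap calls are placeholders for the real interfaces (ompi_modex_recv()/OPAL_MODEX_RECV and opal_bitmap_set_bit(), respectively), and the real add_procs() signature additionally takes the BTL module and an array of proc structures, so treat this purely as an outline of the logic, not as the actual Open MPI API.

#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical peer-address blob, matching the send-side sketch earlier. */
typedef struct {
    uint32_t hca_id;
    uint16_t partition_key;
} lf_addr_t;

/* Hypothetical per-peer endpoint object for the 'lf' BTL. */
typedef struct {
    lf_addr_t peer_addr;
} mca_btl_lf_endpoint_t;

/* Placeholders for the real interfaces (modex receive, partition check,
 * opal_bitmap_set_bit); the signatures are illustrative only. */
extern int  lf_modex_recv(size_t peer_index, lf_addr_t *addr_out);
extern bool lf_same_partition(const lf_addr_t *peer);
extern void lf_mark_reachable(void *reachable_bitmap, size_t peer_index);

/* Schematic add_procs(): called by the ob1 PML (via the r2 BML) with the
 * list of remote procs.  For each peer, fetch the blob it published in the
 * modex, decide whether it is reachable, create an endpoint, and set its
 * bit in the reachability bitmap.  If the bit is never set, r2 will not
 * use this BTL for that peer, and with only 'self' left the job aborts
 * exactly as in the log above. */
static int mca_btl_lf_add_procs(size_t nprocs, void **endpoints,
                                void *reachable_bitmap)
{
    for (size_t i = 0; i < nprocs; ++i) {
        lf_addr_t peer;

        if (0 != lf_modex_recv(i, &peer)) {
            continue;                  /* peer published no lf address       */
        }
        if (!lf_same_partition(&peer)) {
            continue;                  /* different partition: unreachable   */
        }

        mca_btl_lf_endpoint_t *ep = malloc(sizeof(*ep));
        if (NULL == ep) {
            return -1;
        }
        ep->peer_addr = peer;
        endpoints[i] = ep;

        lf_mark_reachable(reachable_bitmap, i);  /* tell r2 this peer is usable */
    }
    return 0;
}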