The cm PML is not being selected *for* TCP - you specified tcp for the BTLs, but that only matters if a BTL-based PML (ob1) is chosen in the first place. You evidently have something in your system that is supported by an MTL, and an MTL will always be selected ahead of the BTLs, which is why cm (the MTL-driven PML) wins over ob1.
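
Just to make the selection rule concrete - this is an illustration only, not the actual Open MPI selection code - the framework keeps whichever component's init reports the highest priority, so cm at 30 beats ob1 at 20 in the log below:

/*
 * Illustration only -- not Open MPI source.  The PML framework keeps the
 * component whose init reports the highest priority, so the MTL-backed cm
 * (priority 30 in the log below) beats the BTL-backed ob1 (priority 20)
 * whenever an MTL can run, regardless of which BTLs were requested.
 */
#include <stdio.h>

struct pml_candidate {
    const char *name;
    int priority;               /* value reported by the component's init */
};

int main(void)
{
    struct pml_candidate candidates[] = { { "ob1", 20 }, { "cm", 30 } };
    size_t n = sizeof(candidates) / sizeof(candidates[0]);
    struct pml_candidate *best = &candidates[0];

    for (size_t i = 1; i < n; ++i) {
        if (candidates[i].priority > best->priority) {
            best = &candidates[i];
        }
    }
    /* Mirrors the "selected cm best priority 30" line in the verbose output. */
    printf("selected %s best priority %d\n", best->name, best->priority);
    return 0;
}
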
> On Apr 28, 2016, at 8:22 PM, dpchoudh . <dpcho...@gmail.com> wrote:
>
> Hello Gilles
>
> You are absolutely right:
>
> 1. Adding --mca pml_base_verbose 100 does show that it is the cm PML that is being picked by default (even for TCP).
> 2. Adding --mca pml ob1 does cause add_procs() and related BTL friends to be invoked.
>
> With a command line of
>
> mpirun -np 2 -hostfile ~/hostfile -mca btl self,tcp -mca btl_base_verbose 100 -mca pml_base_verbose 100 ./mpitest
>
> the output shows (among many other lines) the following:
>
> [smallMPI:49178] select: init returned priority 30
> [smallMPI:49178] select: initializing pml component ob1
> [smallMPI:49178] select: init returned priority 20
> [smallMPI:49178] select: component v not in the include list
> [smallMPI:49178] selected cm best priority 30
> [smallMPI:49178] select: component ob1 not selected / finalized
> [smallMPI:49178] select: component cm selected
>
> This shows that the cm PML was selected. Replacing 'tcp' above with 'openib' shows very similar results. (The openib BTL methods are not invoked either.)
>
> However, I was under the impression that the cm PML can only handle MTLs (and ob1 can only handle BTLs). So why is cm being selected for TCP?
>
> Thank you
> Durga
>
> The surgeon general advises you to eat right, exercise regularly and quit ageing.
>
> On Thu, Apr 28, 2016 at 2:34 AM, Gilles Gouaillardet <gil...@rist.or.jp> wrote:
> the add_procs subroutine of the btl should be called.
>
> /* i added a printf in mca_btl_tcp_add_procs and it *is* invoked */
>
> can you try again with --mca pml ob1 --mca pml_base_verbose 100 ?
>
> maybe the add_procs subroutine is not invoked because Open MPI uses cm instead of ob1
>
> Cheers,
>
> Gilles
>
> On 4/28/2016 3:07 PM, dpchoudh . wrote:
>> Hello all
>>
>> I have been struggling with this issue for the last few days and thought it would be prudent to ask for help from people who have far more experience than I do.
>>
>> There are two questions, interrelated in my mind, but perhaps not so in reality. Question 2 is the issue I am struggling with, and question 1 sort of leads up to it.
>>
>> 1. I see that in both the openib and tcp BTLs (the two kinds of hardware I have access to) a modex send happens, but a matching modex receive never happens. Is this because of some kind of optimization? (In my case, both IP NICs are in the same IP subnet and both IB NICs are in the same IB subnet.) Or am I not understanding something? How do the processes figure out their peers' information without a modex receive?
>>
>> The place in the code where the modex receive is called is btl_add_procs(). However, it looks like in both of the above BTLs this method is never called. Is that expected?
>>
>> 2. The real question is this:
>> I am writing a BTL for a proprietary RDMA NIC (named 'lf' in the code) that has no routing capability in its protocol, and hence no concept of subnets. An HCA simply needs to be plugged into the switch and it can see the whole network. However, there is a VLAN-like partition mechanism (similar to IB partitions). Given this (and as a first cut every node is in the same partition, so even that complexity is eliminated), there is not much use for a modex exchange, but I added one anyway, carrying just the partition key.
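
For reference, the modex side of a BTL usually boils down to one send in the component init and one receive when peers are added. The sketch below is only that - a sketch - assuming a 2.x-era tree with the OPAL_MODEX_SEND/OPAL_MODEX_RECV macros; the lf function names are placeholders, and the exact macro arguments vary between release series, so check opal/mca/pmix/pmix.h and an existing BTL such as tcp in your own tree.

/* Sketch only: a minimal partition-key modex for a BTL, assuming a 2.x-era
 * tree.  All lf names are placeholders; verify macro arguments and header
 * paths against your checkout. */
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include "opal/mca/btl/btl.h"       /* assumed location on a 2.x-era tree */
#include "opal/mca/pmix/pmix.h"     /* OPAL_MODEX_SEND / OPAL_MODEX_RECV  */

/* Called from the component init: publish this process's partition key.
 * 'comp' would be your BTL's base component, e.g. &mca_btl_lf_component.super
 * (placeholder name) with the usual BTL component layout. */
static int lf_publish_partition_key(mca_btl_base_component_t *comp,
                                    uint16_t partition_key)
{
    int rc;
    OPAL_MODEX_SEND(rc, OPAL_PMIX_GLOBAL, &comp->btl_version,
                    &partition_key, sizeof(partition_key));
    return rc;
}

/* Called from add_procs(): pull the peer's partition key back out. */
static int lf_fetch_partition_key(mca_btl_base_component_t *comp,
                                  opal_proc_t *proc, uint16_t *peer_key)
{
    int rc;
    uint8_t *blob = NULL;
    size_t size = 0;

    OPAL_MODEX_RECV(rc, &comp->btl_version, &proc->proc_name,
                    (void **)&blob, &size);
    if (OPAL_SUCCESS == rc && sizeof(*peer_key) == size) {
        memcpy(peer_key, blob, sizeof(*peer_key));
    }
    free(blob);
    return rc;
}
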
>> What I see is that the component open, register and init all succeed, but the r2 BML still does not choose this network, and thus OMPI aborts because of lack of full reachability.
>>
>> This is my command line:
>>
>> sudo /usr/local/bin/mpirun --allow-run-as-root -hostfile ~/hostfile -np 2 -mca btl self,lf -mca btl_base_verbose 100 -mca bml_base_verbose 100 ./mpitest
>>
>> ('mpitest' is a trivial 'hello world' program plus ONE MPI_Send()/MPI_Recv() to test in-band communication. The sudo is required because the driver currently requires root permission; I was told that this will be fixed. The hostfile has two hosts, named b-2 and b-3, with a back-to-back connection on this 'lf' HCA.)
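
For anyone trying to reproduce this, a test program matching that description looks roughly like the following; this is a reconstruction guessed from the output further down, not the actual mpitest source:

/* Guess at 'mpitest': hello world plus a single MPI_Send/MPI_Recv pair. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char *argv[])
{
    char host[MPI_MAX_PROCESSOR_NAME];
    int len, rank, size, token = 42;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    MPI_Get_processor_name(host, &len);

    printf("Hello from %s\n", host);
    printf("The world has %d nodes\n", size);
    printf("My rank is %d\n", rank);

    /* The single in-band exchange: rank 0 sends, rank 1 receives. */
    if (0 == rank) {
        MPI_Send(&token, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
    } else if (1 == rank) {
        MPI_Recv(&token, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
    }

    MPI_Finalize();
    return 0;
}
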
>> The output of this command is as follows; I have added my comments to explain it a bit.
>>
>> <Output from OMPI logging mechanism>
>> [b-2:21062] mca: base: components_register: registering framework bml components
>> [b-2:21062] mca: base: components_register: found loaded component r2
>> [b-2:21062] mca: base: components_register: component r2 register function successful
>> [b-2:21062] mca: base: components_open: opening bml components
>> [b-2:21062] mca: base: components_open: found loaded component r2
>> [b-2:21062] mca: base: components_open: component r2 open function successful
>> [b-2:21062] mca: base: components_register: registering framework btl components
>> [b-2:21062] mca: base: components_register: found loaded component self
>> [b-2:21062] mca: base: components_register: component self register function successful
>> [b-2:21062] mca: base: components_register: found loaded component lf
>> [b-2:21062] mca: base: components_register: component lf register function successful
>> [b-2:21062] mca: base: components_open: opening btl components
>> [b-2:21062] mca: base: components_open: found loaded component self
>> [b-2:21062] mca: base: components_open: component self open function successful
>> [b-2:21062] mca: base: components_open: found loaded component lf
>>
>> <Debugging output from the HCA driver>
>> lf_group_lib.c:442: _lf_open: _lf_open("MPI_0",0x842,0x1b6,4096,0)
>>
>> <Output from OMPI logging mechanism, continued>
>> [b-2:21062] mca: base: components_open: component lf open function successful
>> [b-2:21062] select: initializing btl component self
>> [b-2:21062] select: init of component self returned success
>> [b-2:21062] select: initializing btl component lf
>>
>> <Debugging output from the HCA driver>
>> Created group on b-2
>>
>> <Output from OMPI logging mechanism, continued>
>> [b-2:21062] select: init of component lf returned success
>> [b-3:07672] mca: base: components_register: registering framework bml components
>> [b-3:07672] mca: base: components_register: found loaded component r2
>> [b-3:07672] mca: base: components_register: component r2 register function successful
>> [b-3:07672] mca: base: components_open: opening bml components
>> [b-3:07672] mca: base: components_open: found loaded component r2
>> [b-3:07672] mca: base: components_open: component r2 open function successful
>> [b-3:07672] mca: base: components_register: registering framework btl components
>> [b-3:07672] mca: base: components_register: found loaded component self
>> [b-3:07672] mca: base: components_register: component self register function successful
>> [b-3:07672] mca: base: components_register: found loaded component lf
>> [b-3:07672] mca: base: components_register: component lf register function successful
>> [b-3:07672] mca: base: components_open: opening btl components
>> [b-3:07672] mca: base: components_open: found loaded component self
>> [b-3:07672] mca: base: components_open: component self open function successful
>> [b-3:07672] mca: base: components_open: found loaded component lf
>> [b-3:07672] mca: base: components_open: component lf open function successful
>> [b-3:07672] select: initializing btl component self
>> [b-3:07672] select: init of component self returned success
>> [b-3:07672] select: initializing btl component lf
>>
>> <Debugging output from the HCA driver>
>> lf_group_lib.c:442: _lf_open: _lf_open("MPI_0",0x842,0x1b6,4096,0)
>> Created group on b-3
>>
>> <Output from OMPI logging mechanism, continued>
>> [b-3:07672] select: init of component lf returned success
>> [b-2:21062] mca: bml: Using self btl for send to [[6866,1],0] on node b-2
>> [b-3:07672] mca: bml: Using self btl for send to [[6866,1],1] on node b-3
>>
>> <Output from the 'mpitest' MPI program: out-of-band I/O>
>> Hello from b-2
>> The world has 2 nodes
>> My rank is 0
>> Hello from b-3
>>
>> <Output from OMPI>
>> --------------------------------------------------------------------------
>> At least one pair of MPI processes are unable to reach each other for
>> MPI communications. This means that no Open MPI device has indicated
>> that it can be used to communicate between these processes. This is
>> an error; Open MPI requires that all MPI processes be able to reach
>> each other. This error can sometimes be the result of forgetting to
>> specify the "self" BTL.
>>
>> Process 1 ([[6866,1],0]) is on host: b-2
>> Process 2 ([[6866,1],1]) is on host: 10.4.70.12
>> BTLs attempted: self
>>
>> Your MPI job is now going to abort; sorry.
>> --------------------------------------------------------------------------
>>
>> <Output from the 'mpitest' MPI program: out-of-band I/O, continued>
>> The world has 2 nodes
>> My rank is 1
>>
>> <Output from OMPI logging mechanism, continued>
>> [b-2:21062] *** An error occurred in MPI_Send
>> [b-2:21062] *** reported by process [140385751007233,21474836480]
>> [b-2:21062] *** on communicator MPI_COMM_WORLD
>> [b-2:21062] *** MPI_ERR_INTERN: internal error
>> [b-2:21062] *** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
>> [b-2:21062] *** and potentially your MPI job)
>> [durga@b-2 ~]$
>>
>> As you can see, the lf network is not being chosen for communication. Without a modex exchange, how can that happen? Or, in a nutshell, what do I need to do?
>>
>> Thanks a lot in advance
>> Durga
>>
>> 1% of the executables have 99% of CPU privilege!
>> Userspace code! Unite!! Occupy the kernel!!!
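
One thing to check once ob1 is forced (--mca pml ob1) so that add_procs() actually runs: the r2 BML only keeps a BTL for a peer if that BTL's add_procs() sets the peer's bit in the 'reachable' bitmap. A rough sketch, assuming the 2.x-era BTL interface under opal/mca/btl; the lf names are placeholders and the endpoint setup is entirely transport specific:

/* Sketch only (2.x-era BTL interface assumed): r2 passes a 'reachable'
 * bitmap into each BTL's add_procs(); peers left unmarked are never routed
 * through that BTL, which is one way to end up at the "unable to reach
 * each other" abort shown above. */
#include "opal/constants.h"
#include "opal/class/opal_bitmap.h"
#include "opal/mca/btl/btl.h"

static int mca_btl_lf_add_procs(struct mca_btl_base_module_t *btl,
                                size_t nprocs,
                                struct opal_proc_t **procs,
                                struct mca_btl_base_endpoint_t **peers,
                                struct opal_bitmap_t *reachable)
{
    for (size_t i = 0; i < nprocs; ++i) {
        /* Create/look up an endpoint for procs[i] here (transport specific),
         * typically using the peer's modex data pulled via OPAL_MODEX_RECV. */
        peers[i] = NULL;    /* placeholder: real code stores the endpoint */

        /* Tell the r2 BML this BTL can talk to this peer. */
        opal_bitmap_set_bit(reachable, (int) i);
    }
    return OPAL_SUCCESS;
}
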