At first glance, that seems a bit odd... Are you sure you are printing the reachable bitmap correctly? I would suggest adding some instrumentation to understand what happens (e.g., a printf right before opal_bitmap_set_bit(), and at the other places that can prevent that call from being reached).
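As a rough illustration of that kind of instrumentation, here is a minimal sketch in plain C. It does not use the Open MPI headers: toy_bitmap_t, toy_bitmap_set_bit(), toy_bitmap_is_set() and toy_add_procs() are hypothetical stand-ins for opal_bitmap_t, opal_bitmap_set_bit() and a BTL add_procs callback, and the usable[] array is an assumed placeholder for whatever per-peer check your BTL performs.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for opal_bitmap_t: one bit per peer process. */
typedef struct {
    unsigned char bits[16];              /* room for 128 peers */
} toy_bitmap_t;

static void toy_bitmap_clear_all(toy_bitmap_t *bm)
{
    memset(bm->bits, 0, sizeof(bm->bits));
}

/* Returns 0 on success, -1 on a NULL bitmap or out-of-range bit. */
static int toy_bitmap_set_bit(toy_bitmap_t *bm, int bit)
{
    if (NULL == bm || bit < 0 || bit >= (int)(8 * sizeof(bm->bits))) {
        return -1;
    }
    bm->bits[bit / 8] |= (unsigned char)(1u << (bit % 8));
    return 0;
}

/* Returns 1 if the bit is set, 0 otherwise (including bad arguments). */
static int toy_bitmap_is_set(const toy_bitmap_t *bm, int bit)
{
    if (NULL == bm || bit < 0 || bit >= (int)(8 * sizeof(bm->bits))) {
        return 0;
    }
    return (bm->bits[bit / 8] >> (bit % 8)) & 1;
}

/*
 * Instrumented add_procs-like loop (hypothetical): every path that sets
 * a bit, or skips setting it, logs the reason to stderr.
 * Returns the number of peers marked reachable.
 */
static int toy_add_procs(int nprocs, const int *usable, toy_bitmap_t *reachable)
{
    int marked = 0;

    assert(NULL != usable);
    for (int i = 0; i < nprocs; i++) {
        if (!usable[i]) {
            fprintf(stderr, "add_procs: peer %d skipped (no usable endpoint)\n", i);
            continue;
        }
        if (NULL == reachable) {
            fprintf(stderr, "add_procs: peer %d usable, but reachable is NULL\n", i);
            continue;
        }
        fprintf(stderr, "add_procs: setting reachable bit %d\n", i);
        if (0 == toy_bitmap_set_bit(reachable, i)) {
            marked++;
        }
    }
    return marked;
}
```

Calling toy_add_procs() with a cleared bitmap and then checking each peer with toy_bitmap_is_set() shows, in the log, exactly which branch was taken per peer. In a real BTL the same idea would apply to the actual add_procs callback, with the early-return paths logged as well.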
One more thing: on master, the default behavior is now

mpirun --mca mpi_add_procs_cutoff 0 ...

You might want to try

mpirun --mca mpi_add_procs_cutoff 1024 ...

and see if things make more sense. If it helps, and IIRC, there is a parameter so a btl can report that it does not support cutoff.

Cheers,

Gilles

On Sunday, May 15, 2016, dpchoudh . <dpcho...@gmail.com> wrote:

> Hello Gilles
>
> Thanks for jumping in to help again. Actually, I had already tried some of
> your suggestions before asking for help.
>
> I have several interconnects that can run both openib and tcp BTL. To
> simplify things, I explicitly mentioned TCP:
>
> mpirun -np 2 -hostfile ~/hostfile -mca pml ob1 -mca btl self,tcp ./mpitest
>
> where mpitest is a small program that does MPI_Send()/MPI_Recv() on a
> small string, and then does an MPI_Barrier(). The program does work as
> expected.
>
> I put a printf on the last line of mca_btl_tcp_add_procs() to print the
> value of 'reachable'. What I saw was that the value was always 0 when it
> was invoked for Send()/Recv(), and the pointer itself was NULL when it was
> invoked for Barrier().
>
> Next I looked at mca_pml_ob1_add_procs(), where the call chain starts, and
> found that it initializes and passes an opal_bitmap_t reachable down the
> call chain, but the resulting value is not used later in the code (the
> memory is simply freed later).
>
> That, coupled with the fact that I am trying to imitate what the other BTL
> implementations are doing, yet in mca_bml_r2_endpoint_add_btl() my BTL is
> not being picked up, left me puzzled. Please note that the interconnect
> that I am developing for is on a different cluster (than where I ran the
> above test for TCP BTL.)
>
> Thanks again
> Durga
>
> The surgeon general advises you to eat right, exercise regularly and quit
> ageing.
>
> On Sun, May 15, 2016 at 10:20 AM, Gilles Gouaillardet <
> gilles.gouaillar...@gmail.com> wrote:
>
>> did you check the add_procs callbacks ?
>> (e.g. mca_btl_tcp_add_procs() for the tcp btl)
>> this is where the reachable bitmap is set, and I guess this is what you
>> are looking for.
>>
>> keep in mind that if several btl can be used, the one with the higher
>> exclusivity is used
>> (e.g. tcp is never used if openib is available)
>> you can simply force your btl and self, and the ob1 pml, so you do not
>> have to worry about other btl exclusivity.
>>
>> Cheers,
>>
>> Gilles
>>
>>
>> On Sunday, May 15, 2016, dpchoudh . <dpcho...@gmail.com> wrote:
>>
>>> Hello all
>>>
>>> I have been struggling with this issue for a while and figured it might
>>> be a good idea to ask for help.
>>>
>>> Where (in the code path) is the connectivity map created?
>>>
>>> I can see that it is *used* in mca_bml_r2_endpoint_add_btl(), but
>>> obviously I am not setting it up right, because this routine is not finding
>>> the BTL corresponding to my interconnect.
>>>
>>> Thanks in advance
>>> Durga
>>>
>>> The surgeon general advises you to eat right, exercise regularly and
>>> quit ageing.
>>>
>>
>> _______________________________________________
>> devel mailing list
>> de...@open-mpi.org
>> Subscription: https://www.open-mpi.org/mailman/listinfo.cgi/devel
>> Link to this post:
>> http://www.open-mpi.org/community/lists/devel/2016/05/18975.php
>>
>
>