At first glance, that seems a bit odd...
are you sure you print the reachable bitmap correctly ?
I would suggest you add some instrumentation to understand what happens
(e.g., a printf before opal_bitmap_set_bit() and at the other places that
can prevent the bit from being set)
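
As a rough illustration of that kind of instrumentation, here is a minimal standalone sketch. It does not use the Open MPI headers: toy_bitmap_t and the toy_bitmap_*() helpers are hypothetical stand-ins for opal_bitmap_t and its accessors, so the tracing pattern can be tried in isolation. The NULL check is there because a reachable pointer can apparently be NULL on some call paths.

```c
#include <assert.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for opal_bitmap_t; NOT the Open MPI type. */
typedef struct {
    unsigned char *bits;
    int nbits;
} toy_bitmap_t;

/* Allocate a zeroed bitmap able to hold nbits bits; returns 0 on success. */
static int toy_bitmap_init(toy_bitmap_t *bm, int nbits)
{
    bm->bits = calloc((size_t)(nbits + CHAR_BIT - 1) / CHAR_BIT, 1);
    bm->nbits = nbits;
    return bm->bits ? 0 : -1;
}

/* Return 1 if the given bit is set, 0 otherwise. */
static int toy_bitmap_is_set(const toy_bitmap_t *bm, int bit)
{
    assert(bit >= 0 && bit < bm->nbits);
    return (bm->bits[bit / CHAR_BIT] >> (bit % CHAR_BIT)) & 1;
}

/* Instrumented setter: logs who sets which bit, and reports the cases
 * (NULL bitmap, out-of-range index) where nothing gets recorded. */
static int toy_bitmap_set_traced(toy_bitmap_t *bm, int bit, const char *where)
{
    if (NULL == bm) {
        fprintf(stderr, "[%s] reachable is NULL, bit %d not recorded\n", where, bit);
        return -1;
    }
    if (bit < 0 || bit >= bm->nbits) {
        fprintf(stderr, "[%s] bit %d out of range (size %d)\n", where, bit, bm->nbits);
        return -1;
    }
    fprintf(stderr, "[%s] setting bit %d\n", where, bit);
    bm->bits[bit / CHAR_BIT] |= (unsigned char)(1u << (bit % CHAR_BIT));
    return 0;
}

/* Write the bitmap as a string of '0'/'1' characters, bit 0 first.
 * A NULL bitmap yields an empty string. */
static void toy_bitmap_format(const toy_bitmap_t *bm, char *out, size_t outlen)
{
    size_t i = 0;
    if (0 == outlen) {
        return;
    }
    if (NULL != bm) {
        for (; i < (size_t)bm->nbits && i + 1 < outlen; ++i) {
            out[i] = toy_bitmap_is_set(bm, (int)i) ? '1' : '0';
        }
    }
    out[i] = '\0';
}
```

In a real BTL the same idea would presumably be a printf placed right before the existing opal_bitmap_set_bit() call in the add_procs callback, plus a dump of the whole bitmap just before the callback returns.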

one more thing ...
the default behavior on master is now
mpirun --mca mpi_add_procs_cutoff 0 ...
you might want to try
mpirun --mca mpi_add_procs_cutoff 1024 ...
and see if things make more sense.
if it helps, and iirc, there is a parameter so a btl can report it does not
support cutoff.


Cheers,

Gilles

On Sunday, May 15, 2016, dpchoudh . <dpcho...@gmail.com> wrote:

> Hello Gilles
>
> Thanks for jumping in to help again. Actually, I had already tried some of
> your suggestions before asking for help.
>
> I have several interconnects that can run both openib and tcp BTL. To
> simplify things, I explicitly mentioned TCP:
>
> mpirun -np 2 -hostfile ~/hostfile -mca pml ob1 -mca btl self,tcp ./mpitest
>
> where mpitest is a small program that does MPI_Send()/MPI_Recv() on a
> small string, and then does an MPI_Barrier(). The program does work as
> expected.
>
> I put a printf on the last line of mca_btl_tcp_add_procs() to print the
> value of 'reachable'. What I saw was that the value was always 0 when it
> was invoked for Send()/Recv(), and the pointer itself was NULL when it was
> invoked for Barrier().
>
> Next I looked at mca_pml_ob1_add_procs(), where the call chain starts, and
> found that it initializes and passes an opal_bitmap_t reachable down the
> call chain, but the resulting value is not used later in the code (the
> memory is simply freed later).
>
> That, coupled with the fact that I am trying to imitate what the other BTL
> implementations are doing, yet in mca_bml_r2_endpoint_add_btl() my BTL is
> not being picked up, left me puzzled. Please note that the interconnect
> that I am developing for is on a different cluster from the one where I ran
> the above TCP BTL test.
>
> Thanks again
> Durga
>
> The surgeon general advises you to eat right, exercise regularly and quit
> ageing.
>
> On Sun, May 15, 2016 at 10:20 AM, Gilles Gouaillardet <
> gilles.gouaillar...@gmail.com> wrote:
>
>> did you check the add_procs callbacks ?
>> (e.g. mca_btl_tcp_add_procs() for the tcp btl)
>> this is where the reachable bitmap is set, and I guess this is what you
>> are looking for.
>>
>> keep in mind that if several btls can be used, the one with the highest
>> exclusivity is used
>> (e.g. tcp is never used if openib is available)
>> you can simply force your btl and self, and the ob1 pml, so you do not
>> have to worry about other btl exclusivity.
>>
>> Cheers,
>>
>> Gilles
>>
>>
>> On Sunday, May 15, 2016, dpchoudh . <dpcho...@gmail.com> wrote:
>>
>>> Hello all
>>>
>>> I have been struggling with this issue for a while and figured it might
>>> be a good idea to ask for help.
>>>
>>> Where (in the code path) is the connectivity map created?
>>>
>>> I can see that it is *used* in mca_bml_r2_endpoint_add_btl(), but
>>> obviously I am not setting it up right, because this routine is not finding
>>> the BTL corresponding to my interconnect.
>>>
>>> Thanks in advance
>>> Durga
>>>
>>> The surgeon general advises you to eat right, exercise regularly and
>>> quit ageing.
>>>
>>
>> _______________________________________________
>> devel mailing list
>> de...@open-mpi.org
>> Subscription: https://www.open-mpi.org/mailman/listinfo.cgi/devel
>> Link to this post:
>> http://www.open-mpi.org/community/lists/devel/2016/05/18975.php
>>
>
>
