George

Thanks for your help. But what should the progress function return so that
the event is signalled? Right now I return 1 when data has been
transmitted and 0 otherwise, but that does not seem to work. Also, please
keep in mind that the transport I am working on supports unreliable
datagrams only, so there is no ack from the recipient to wait for.
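
To make the question concrete, here is a minimal sketch of one possible return convention. It assumes (this is not confirmed anywhere in this thread) that the value returned by the progress function is read as the number of completions reported during that call. All names are hypothetical stand-ins and not Open MPI symbols. Unlike the static-flag version described in the original message, which returns 1 on every call after the send, this one reports each send exactly once. Whether a return count alone is enough to signal completion to the upper layer is exactly the open question, and this sketch does not settle it.

```c
/* Toy model only: hypothetical names, not Open MPI code. */

/* Sends handed to the fabric that have not yet been reported as complete. */
static int pending_completions = 0;

/* Stand-in for the endpoint send path. With unreliable datagrams there is
 * no ack to wait for, so the send is treated as complete once it is posted. */
void toy_endpoint_send(void)
{
    pending_completions++;
}

/* Stand-in for the component progress function. Returns how many
 * completions are reported by this call, then resets the counter so the
 * same send is not counted again on the next call. */
int toy_component_progress(void)
{
    int completed = pending_completions;
    pending_completions = 0;
    return completed;
}
```

With this convention, a call before any send returns 0, the first call after one send returns 1, and the next call returns 0 again.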

Thanks again
Durga

The surgeon general advises you to eat right, exercise regularly and quit
ageing.

On Thu, May 5, 2016 at 11:33 PM, George Bosilca <bosi...@icl.utk.edu> wrote:

> Durga,
>
> TCP doesn't need a specialized progress function because we are tied
> directly to libevent. In your case you should provide a BTL progress
> function, which will be called regularly at the end of the libevent base
> loop.
>
>   George.
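
The calling pattern described in the reply above can be pictured with a small toy model: one pass over the event loop, followed by a call to every registered progress callback, with the returned counts summed. This is only a sketch under stated assumptions. Every name here is hypothetical, the event loop pass is a stub, and nothing below is an Open MPI or libevent symbol.

```c
#include <stddef.h>

/* Toy model only: hypothetical names, not Open MPI or libevent code. */

typedef int (*toy_progress_fn)(void);

#define TOY_MAX_CALLBACKS 8
static toy_progress_fn toy_callbacks[TOY_MAX_CALLBACKS];
static size_t toy_num_callbacks = 0;

/* Register a progress callback. Returns 0 on success, -1 if the callback
 * is NULL or the table is full. */
int toy_progress_register(toy_progress_fn fn)
{
    if (fn == NULL || toy_num_callbacks == TOY_MAX_CALLBACKS) {
        return -1;
    }
    toy_callbacks[toy_num_callbacks++] = fn;
    return 0;
}

/* Stub for one non-blocking pass over the event loop; reports no events. */
static int toy_event_loop_once(void)
{
    return 0;
}

/* One progress pass: run the event loop stub first, then call each
 * registered callback and add up the counts they return. */
int toy_progress(void)
{
    int events = toy_event_loop_once();
    for (size_t i = 0; i < toy_num_callbacks; i++) {
        events += toy_callbacks[i]();
    }
    return events;
}

/* Example callback standing in for a BTL progress function: it reports
 * the work marked ready since the last call, then clears it. */
static int toy_ready = 0;

void toy_mark_ready(int n)
{
    toy_ready += n;
}

int toy_btl_progress(void)
{
    int n = toy_ready;
    toy_ready = 0;
    return n;
}
```

In this model a transport that is not driven by the event loop only makes progress through its registered callback, which is why the callback has to poll the hardware itself.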
>
>
> On Thu, May 5, 2016 at 11:30 PM, dpchoudh . <dpcho...@gmail.com> wrote:
>
>> Hi all
>>
>> Apologies for a 101 level question again, but here it is:
>>
>> A new BTL layer I am implementing hangs in MPI_Send(). Please keep in
>> mind that at this stage, I am simply desperate to make MPI data move
>> through this fabric in any way possible, so I have thrown all good
>> programming practice out of the window and in the process might have added
>> bugs.
>>
>> The test code basically has a single call to MPI_Send() with 8 bytes of
>> data, the smallest amount the HCA can DMA. I have a very simple
>> mca_btl_component_progress() method that returns 0 if called before
>> mca_btl_endpoint_send() and returns 1 if called after. I use a static
>> variable to keep track of whether endpoint_send() has been called.
>>
>> With this, the MPI process hangs with the following stack:
>>
>> (gdb) bt
>> #0  0x00007f7518c60b7d in poll () from /lib64/libc.so.6
>> #1  0x00007f75183e79f6 in poll_dispatch (base=0x19cf480, tv=0x7f75177efe80) at poll.c:165
>> #2  0x00007f75183df690 in opal_libevent2022_event_base_loop (base=0x19cf480, flags=1) at event.c:1630
>> #3  0x00007f75183613d4 in progress_engine (obj=0x19cedd8) at runtime/opal_progress_threads.c:105
>> #4  0x00007f7518f3ddf5 in start_thread () from /lib64/libpthread.so.0
>> #5  0x00007f7518c6b1ad in clone () from /lib64/libc.so.6
>>
>> I am using code from master branch for this work.
>>
>> Obviously I am not doing the progress handling right, and I don't even
>> understand how it should work, as the TCP BTL does not even provide a
>> component progress function.
>>
>> Any relevant pointer on how this should be done is highly appreciated.
>>
>> Thanks
>> Durga
>>
>>
>> The surgeon general advises you to eat right, exercise regularly and quit
>> ageing.
>>
>> _______________________________________________
>> devel mailing list
>> de...@open-mpi.org
>> Subscription: https://www.open-mpi.org/mailman/listinfo.cgi/devel
>> Link to this post:
>> http://www.open-mpi.org/community/lists/devel/2016/05/18919.php
>>
>
>
> _______________________________________________
> devel mailing list
> de...@open-mpi.org
> Subscription: https://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post:
> http://www.open-mpi.org/community/lists/devel/2016/05/18920.php
>
