Thanks Pasha!!

On Nov 14, 2013, at 4:34 PM, Shamis, Pavel <sham...@ornl.gov> wrote:

> For Iboffload this should not be an issue, since our connection manager is 
> blocking (I have to double-check).
> 
> For openib, this should not be such a huge change. The code is pretty much 
> standalone; we only have to move it to the main thread and add a signaling 
> mechanism.
> 
> I will take a look.
> 
> Best,
> -Pasha
> 
> 
> 
> 
> On Nov 14, 2013, at 7:25 PM, Ralph Castain <r...@open-mpi.org> wrote:
> 
>> 
>> On Nov 14, 2013, at 4:22 PM, Shamis, Pavel <sham...@ornl.gov> wrote:
>> 
>>> Well, this is a major change in behavior.
>>> 
>>> Since openib makes communication calls from the callback,
>>> it pretty much requires enabling thread safety at the openib BTL level.
>> 
>> Ah, yes - could well be true. Or else separate the two like we do elsewhere 
>> - transfer the recv callback to the openib thread and let it do the rest.
>> 
>>> 
>>> But we may move the queue flush operation from the callback to the main 
>>> thread, so that the progress engine waits on a signal from the callback.
>> 
>> Yep - that's what we do elsewhere
>> 
>>> 
>>> How does it work for other parts of OMPI (sm, communicator)?
>>> I guess they don't do anything in the callbacks?
>> 
>> Correct - they immediately transfer the info to their local progress engine 
>> (in whatever form).
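
A minimal sketch of that hand-off, in plain C with pthreads rather than the actual OMPI/ORTE code. Every name here (msg_t, handoff_t, recv_callback, progress, run_demo) is hypothetical and invented for illustration. The callback running in the event thread only queues the message and signals; the main thread's progress function detaches the queue and does the real work (standing in for the openib queue flush) outside the lock.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical message node; not an Open MPI type. */
typedef struct msg {
    int payload;
    struct msg *next;
} msg_t;

/* Hypothetical hand-off queue shared by the event thread and main thread. */
typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t ready;
    msg_t *head, *tail;
} handoff_t;

static handoff_t pending = {
    PTHREAD_MUTEX_INITIALIZER, PTHREAD_COND_INITIALIZER, NULL, NULL
};

/* Runs in the event thread: queue the message and signal. No
 * communication calls are made from here. */
static void recv_callback(int payload)
{
    msg_t *m = malloc(sizeof(*m));
    if (m == NULL) abort();
    m->payload = payload;
    m->next = NULL;
    pthread_mutex_lock(&pending.lock);
    if (pending.tail != NULL) pending.tail->next = m;
    else pending.head = m;
    pending.tail = m;
    pthread_cond_signal(&pending.ready);
    pthread_mutex_unlock(&pending.lock);
}

/* Runs in the main thread: optionally wait for the signal, detach the
 * whole list, then process it outside the lock. Returns the number of
 * messages handled. */
static int progress(int block, int *sum)
{
    pthread_mutex_lock(&pending.lock);
    while (block && pending.head == NULL)
        pthread_cond_wait(&pending.ready, &pending.lock);
    msg_t *m = pending.head;
    pending.head = pending.tail = NULL;
    pthread_mutex_unlock(&pending.lock);

    int handled = 0;
    while (m != NULL) {
        msg_t *next = m->next;
        *sum += m->payload;   /* stand-in for the real queue flush */
        free(m);
        m = next;
        handled++;
    }
    return handled;
}

/* Simulated event thread delivering payloads 1..n. */
static void *event_thread(void *arg)
{
    int n = *(int *)arg;
    for (int i = 1; i <= n; i++) recv_callback(i);
    return NULL;
}

/* Drives the demo: returns the sum of all payloads handled in the
 * main thread, or -1 if the thread could not be created. */
int run_demo(int n)
{
    pthread_t tid;
    int handled = 0, sum = 0;
    if (pthread_create(&tid, NULL, event_thread, &n) != 0) return -1;
    while (handled < n) handled += progress(1, &sum);
    pthread_join(tid, NULL);
    assert(handled == n);
    return sum;
}
```

Built with -pthread, run_demo(4) returns 10. Passing block = 0 to progress() gives the polling variant, where the progress engine just checks the queue on each pass instead of waiting on the signal.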
>> 
>>> 
>>> Best,
>>> Pasha
>>> 
>>> On Nov 14, 2013, at 6:35 PM, Ralph Castain <r...@open-mpi.org> wrote:
>>> 
>>>> 
>>>> On Nov 14, 2013, at 3:33 PM, Shamis, Pavel <sham...@ornl.gov> wrote:
>>>> 
>>>>> 
>>>>>> The only change is that the receive callback is now occurring in the 
>>>>>> ORTE event thread, and so perhaps someone needs to look at a way to pass 
>>>>>> that back into the OMPI event base (which I guess is the OPAL event 
>>>>>> base)? Just glancing at the code, it looks like that could be the issue 
>>>>>> - but I honestly have no idea what event base someone wants to switch 
>>>>>> to, or if they want to resolve it some other way. There are clearly some 
>>>>>> things happening in the ofacm oob code that involve thread locking etc., 
>>>>>> but I don't know what those areas are trying to do.
>>>>> 
>>>>> I see. In this mode, do you enable thread safety support in the whole 
>>>>> library (MPI)?
>>>> 
>>>> Only if the user configures to do so - ORTE doesn't require it as we use 
>>>> the event library's thread safety and do everything inside events.
>>>> 
>>>>> 
>>>>> _______________________________________________
>>>>> devel mailing list
>>>>> de...@open-mpi.org
>>>>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
