Thanks Pasha!!

On Nov 14, 2013, at 4:34 PM, Shamis, Pavel <sham...@ornl.gov> wrote:
> For Iboffload this should not be an issue since our connection manager is
> blocking (I have to double-check).
>
> For openib, this should not be such a huge change. The code is pretty much
> standalone; we only have to move it to the main thread and add a signaling
> mechanism.
>
> I will take a look.
>
> Best,
> -Pasha
>
> On Nov 14, 2013, at 7:25 PM, Ralph Castain <r...@open-mpi.org> wrote:
>
>> On Nov 14, 2013, at 4:22 PM, Shamis, Pavel <sham...@ornl.gov> wrote:
>>
>>> Well, this is a major change in behavior.
>>>
>>> Since openib makes communication calls from the callback,
>>> it pretty much requires enabling thread safety at the openib btl level.
>>
>> Ah, yes - could well be true. Or else separate the two like we do elsewhere
>> - transfer the recv callback to the openib thread and let it do the rest.
>>
>>> But we may move the queue flush operation from the callback to the main
>>> thread, so the progress engine will wait on a signal from the callback.
>>
>> Yep - that's what we do elsewhere
>>
>>> How does it work for other parts of OMPI (sm, communicator)?
>>> I guess they don't do anything in the callbacks?
>>
>> Correct - they immediately transfer the info to their local progress engine
>> (in whatever form).
>>
>>> Best,
>>> Pasha
>>>
>>> On Nov 14, 2013, at 6:35 PM, Ralph Castain <r...@open-mpi.org> wrote:
>>>
>>>> On Nov 14, 2013, at 3:33 PM, Shamis, Pavel <sham...@ornl.gov> wrote:
>>>>
>>>>>> The only change is that the receive callback is now occurring in the
>>>>>> ORTE event thread, and so perhaps someone needs to look at a way to pass
>>>>>> that back into the OMPI event base (which I guess is the OPAL event
>>>>>> base)? Just glancing at the code, it looks like that could be the issue
>>>>>> - but I honestly have no idea what event base someone wants to switch
>>>>>> to, or if they want to resolve it some other way. There are clearly some
>>>>>> things happening in the ofacm oob code that involve thread locking etc.,
>>>>>> but I don't know what those areas are trying to do.
>>>>>
>>>>> I see. In this mode do you enable thread safety support in the whole
>>>>> library (mpi)?
>>>>
>>>> Only if the user configures to do so - ORTE doesn't require it as we use
>>>> the event library's thread safety and do everything inside events.
>>>>
>>>>> _______________________________________________
>>>>> devel mailing list
>>>>> de...@open-mpi.org
>>>>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
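
For reference, the hand-off discussed in this thread (the callback only records the event and signals, and the main thread does the queue flush) could look roughly like the sketch below. This is not the openib or ofacm code: it assumes POSIX threads, and every name in it (handoff_queue_t, handoff_post, handoff_drain) is hypothetical and exists only for illustration.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical record for one received connection event (not an OMPI type). */
typedef struct pending_event {
    int peer;                        /* illustrative payload: peer rank */
    struct pending_event *next;
} pending_event_t;

/* Hypothetical hand-off queue shared by the callback and main threads. */
typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  ready;           /* signaled whenever an event is queued */
    pending_event_t *head, *tail;
} handoff_queue_t;

static void handoff_init(handoff_queue_t *q)
{
    pthread_mutex_init(&q->lock, NULL);
    pthread_cond_init(&q->ready, NULL);
    q->head = q->tail = NULL;
}

/* Runs in the event (callback) thread: queue the event and signal.
 * No communication calls are made here. */
static void handoff_post(handoff_queue_t *q, int peer)
{
    pending_event_t *ev = malloc(sizeof(*ev));
    assert(ev != NULL);
    ev->peer = peer;
    ev->next = NULL;

    pthread_mutex_lock(&q->lock);
    if (q->tail != NULL) {
        q->tail->next = ev;
    } else {
        q->head = ev;
    }
    q->tail = ev;
    pthread_cond_signal(&q->ready);
    pthread_mutex_unlock(&q->lock);
}

/* Runs in the main thread: block until at least one event is queued, then
 * copy up to max peers into peers[] in arrival order. Returns the number
 * copied; the caller then performs the flush for each peer itself. */
static int handoff_drain(handoff_queue_t *q, int *peers, int max)
{
    int n = 0;

    pthread_mutex_lock(&q->lock);
    while (q->head == NULL) {
        pthread_cond_wait(&q->ready, &q->lock);
    }
    while (q->head != NULL && n < max) {
        pending_event_t *ev = q->head;
        q->head = ev->next;
        if (q->head == NULL) {
            q->tail = NULL;
        }
        peers[n++] = ev->peer;
        free(ev);
    }
    pthread_mutex_unlock(&q->lock);
    return n;
}
```

In this sketch the receive callback would call handoff_post() and return immediately, while the main thread calls handoff_drain() and makes the communication calls for each returned peer, so the communication path itself never runs in the event thread. A real progress engine would probably poll rather than block in pthread_cond_wait(); the blocking wait is used here only to keep the example short.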