BTW,
  I was guessing FTB is Fault Tolerant Backbone, but if not, can someone
tell me what it is?  If it is not the latter, what I just wrote about it
makes no sense.

Rich


On 12/3/08 9:34 PM, "Richard Graham" <rlgra...@ornl.gov> wrote:

> The goal is to use the btl's outside of the context of MPI, which was what was
> in mind from the day the ompi work started over five years ago, but with no
> other use at the time, things grew up intermingled - no surprise at all.  What
> we are attempting to do is to untangle the existing dependencies, and make a
> much cleaner distinction between how/what data is passed between layers.
> 
> I expect this will involve some sort of well defined interface between the
> btl's and orte, and I don't know if this will also require something like this
> between the btl's and the pml - I think that interface is rigidly enforced,
> but am not sure.
> 
> I expect that explicit calls to FTB in the btl layer would have to be
> componentized, especially in the context of what is developing in the FT
> working group of the MPI Forum.  Not that FTB is bad in any way, just that it
> is one of many monitors.
> 
> We will need to talk about this on a case by case basis, and decide how to
> proceed.  If anyone wants to help, please do.
> 
> Rich
> 
> 
> On 12/3/08 3:02 PM, "Ralph Castain" <r...@lanl.gov> wrote:
> 
>> I managed to execute the modex-less changes pretty much without
>> introducing additional ORTE dependencies into the BTL's, though there
>> may be some additions as we look at the other BTLs that I didn't
>> address. So hopefully that won't contribute too much to the issue here.
>> 
>> At the moment, I don't think it matters where notifier sits - it might
>> be able to move to OPAL. Only catch will be if some notifier component
>> requires communications. I'm thinking of FTB, for example, and our own
>> local monitoring program that may require TCP messaging. We don't
>> currently have anything in OPAL that would support an OPAL level
>> messaging system, though perhaps that could be resolved.
>> 
>> We also have dependencies where the BTL's will call orte_ess to find
>> out what node another proc is on, the node local rank of that proc,
>> etc. Those dependencies are likely to grow after the Dec meeting (see
>> wiki for that agenda item), and definitely cannot be moved to OPAL.
>> 
>> However, note that Rich stated the BTL's were -not- moving to OPAL.
>> This begs the question: where -are- they going? Into their own layer?
>> Will that layer be somewhere in-between OMPI and ORTE (in which case,
>> the ORTE dependencies are moot)?
>> 
>> I note that the wiki page doesn't address any of these questions,
>> which is understandable if things are just getting underway. But it
>> does sound like this is going to take some thought to ensure we don't
>> paint ourselves into a corner.
>> 
>> Ralph
>> 
>> 
>> On Dec 3, 2008, at 12:10 PM, Jeff Squyres wrote:
>> 
>>> > FWIW, I see lots of notifier calls being added to the BTLs (and
>>> > elsewhere throughout the OMPI code base) over time...
>>> >
>>> > On Dec 3, 2008, at 2:07 PM, Tim Mattox wrote:
>>> >
>>>> >> The BTLs might have added calls to the notifier framework in their
>>>> >> error paths.
>>>> >> The notifier framework is currently in the ORTE layer... not sure
>>>> >> if we could
>>>> >> move it down to OPAL.  Ralph, any thoughts on that?
>>>> >>
>>>> >> On Wed, Dec 3, 2008 at 11:56 AM, Richard Graham <rlgra...@ornl.gov>
>>>> >> wrote:
>>>>> >>> George told me about what he is doing, so no changes would be
>>>>> >>> committed
>>>>> >>> until George has his changes in.
>>>>> >>>
>>>>> >>> Are there other changes to the btl's that we should be aware of ?
>>>>> >>>
>>>>> >>> Rich
>>>>> >>>
>>>>> >>>
>>>>> >>> On 12/3/08 11:47 AM, "George Bosilca" <bosi...@eecs.utk.edu> wrote:
>>>>> >>>
>>>>>> >>>> Terry,
>>>>>> >>>>
>>>>>> >>>> I'm involved [to some degree] in both efforts and I can confirm
>>>>>> >>>> these
>>>>>> >>>> two efforts will not affect each other in any bad way.
>>>>>> >>>>
>>>>>> >>>>  george.
>>>>>> >>>>
>>>>>> >>>> On Dec 3, 2008, at 11:42 , Terry Dontje wrote:
>>>>>> >>>>
>>>>>>> >>>>> I don't have any *strong* objections. However, I know that Eugene
>>>>>>> >>>>> and George B have been working on some Fastpath code changes, so
>>>>>>> >>>>> we should make sure neither project obliterates the other.
>>>>>>> >>>>>
>>>>>>> >>>>> --td
>>>>>>> >>>>>
>>>>>>> >>>>> Richard Graham wrote:
>>>>>>>> >>>>>> Now that 1.3 will be released, we would like to go ahead with the
>>>>>>>> >>>>>> plan to move the btl's out of the MPI layer. Greg Koenig who is
>>>>>>>> >>>>>> doing most of the work has started a wiki page with details on
>>>>>>>> >>>>>> the
>>>>>>>> >>>>>> plans. Right now details are sketchy, as Greg is digging through
>>>>>>>> >>>>>> the code, and has only hand written notes on data structures that
>>>>>>>> >>>>>> need to be moved, include files that are not needed, etc. The
>>>>>>>> >>>>>> page
>>>>>>>> >>>>>> is at:
>>>>>>>> >>>>>> _https://svn.open-mpi.org/trac/ompi/wiki/BTLExtraction_
>>>>>>>> >>>>>>
>>>>>>>> >>>>>> The first three steps basically only involve code motion, moving
>>>>>>>> >>>>>> items such as ompi_list, and renaming them, moving where the code
>>>>>>>> >>>>>> is actually located in the repository, and the like. For these we
>>>>>>>> >>>>>> do not plan to put out a formal RFC, but comments are very
>>>>>>>> >>>>>> welcome,
>>>>>>>> >>>>>> and any hands that are willing to help with this are even more
>>>>>>>> >>>>>> welcome.
>>>>>>>> >>>>>>
>>>>>>>> >>>>>> The last phase where the btl's are made dependent on OPAL, and
>>>>>>>> >>>>>> supporting libraries such as mpools I expect will be disruptive,
>>>>>>>> >>>>>> and will definitely require an RFC, and will also be a longer
>>>>>>>> >>>>>> process.
>>>>>>>> >>>>>>
>>>>>>>> >>>>>> Please send comments,
>>>>>>>> >>>>>> Rich
>>>>>>>> >>>>>> 
>>>>>>>> >>>>>> ------------------------------------------------------------------------
>>>>>>>> >>>>>>
>>>>>>>> >>>>>> _______________________________________________
>>>>>>>> >>>>>> devel mailing list
>>>>>>>> >>>>>> de...@open-mpi.org
>>>>>>>> >>>>>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>>>>>>>> >>>>>>
>>>>>>> >>>>>
>>>>>> >>>>
>>>>>> >>>>
>>>>> >>>
>>>>> >>>
>>>>> >>>
>>>> >>
>>>> >>
>>>> >>
>>>> >> --
>>>> >> Tim Mattox, Ph.D. - http://homepage.mac.com/tmattox/
>>>> >> tmat...@gmail.com || timat...@open-mpi.org
>>>> >>   I'm a bright... http://www.the-brights.net/
>>>> >>
>>> >
>>> >
>>> > --
>>> > Jeff Squyres
>>> > Cisco Systems
>>> >
>>> >
>> 
>> 
>> 
> 
> 
