That was my thought exactly. Since the point of the notifier component is to return a *useful* description of the failure the BTL hit (e.g., IB ran out of resource X again), that detail will be lost if we just push the error up to the next layer.

Just my $0.02, of course.

Brian
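
For readers following the thread, here is a rough C sketch of the trade-off being argued above. Every name in it (sketch_notify, sketch_send_with_notifier, and so on) is hypothetical and made up for illustration; none of it is the actual orte_notifier or BTL interface. It only contrasts a transport that reports a specific description at the point of failure with one that hands a bare error code to the layer above.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical names throughout: this is NOT the real orte_notifier or BTL API. */
#define SKETCH_SUCCESS 0
#define SKETCH_ERROR  (-1)

/* Last message seen by the stand-in notifier, kept so the effect is observable. */
static char sketch_last_notice[128];

/* Stand-in for a notifier component: records a human-readable description. */
static void sketch_notify(const char *source, const char *description)
{
    assert(source != NULL && description != NULL);
    snprintf(sketch_last_notice, sizeof(sketch_last_notice),
             "%s: %s", source, description);
}

/* Approach A: the transport calls the notifier where the failure happens,
 * so the specific cause is still known and can be reported. */
static int sketch_send_with_notifier(int free_descriptors)
{
    if (free_descriptors <= 0) {
        sketch_notify("btl-sketch", "out of send descriptors");
        return SKETCH_ERROR;
    }
    return SKETCH_SUCCESS;
}

/* Approach B: the transport only returns a status code up the call stack. */
static int sketch_send_status_only(int free_descriptors)
{
    return (free_descriptors <= 0) ? SKETCH_ERROR : SKETCH_SUCCESS;
}

/* Upper layer for approach B: it sees only the code, so its report is generic. */
static void sketch_upper_layer_report(int rc)
{
    if (rc != SKETCH_SUCCESS) {
        sketch_notify("upper-layer", "send failed");
    }
}
```

In approach B the upper layer could only recover the detail if the error interface were widened to carry it (a richer error code or a description string), which is the kind of well-defined interface between layers discussed further down the thread.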

On Thu, 4 Dec 2008, Ralph Castain wrote:

Hmmm...only problem with that idea is that the entities being communicated
to (e.g., SLURM, Moab) have no concept of MPI nor any way to communicate
via that system. They do, however, have APIs that notifier can call, and
know how to speak TCP via their own agreed-upon protocols. And many large
systems turn off the TCP btl (all of ours, for example) because it isn't
needed and opens additional unnecessary ports.
So calling APIs and/or sending messages across the OOB are pretty
straightforward. Teaching Moab to understand btl/datatype engine
messages (flowing across who knows what transport) is an unlikely thing
to happen.

Besides, one of the primary reasons for needing to call notifier is a
failure in the btl - so relying on the btl to send the message is
self-defeating.


On Dec 4, 2008, at 10:37 AM, Richard Graham wrote:

      Here is where I think we should reconsider accessing the
      notifier component in the btl.  It creates dependencies in
      the btl that are not needed.  The idea of a notifier
      component is a good one, but I would defer using it to upper
      layers, rather than embedding it in the guts of the
      communication system.  I would be in favor of an approach
      that sends the information up the call stack.  The btl's should
      not depend on other communication primitives, as they are the
      communication primitive.

      Rich


      On 12/4/08 9:04 AM, "Ralph Castain" <r...@lanl.gov> wrote:

            Yes, FTB utilizes the notifier framework. In addition, we have three
            other components getting ready to be added to that framework that will
            provide interfaces to Moab, SLURM, and a DOE monitoring program. The
            first two will require messaging capabilities to tell the schedulers
            about problem nodes/routes. The latter will also use a messaging
            protocol, but is mostly aimed at alerting operators to a problem and
            creating a historical archive.

            That said, we can expect the use of orte_notifier to spread across
            the BTL's pretty aggressively in the next few months, and for the
            notifier API to change/expand as we address these needs.

            On Dec 4, 2008, at 6:13 AM, Jeff Squyres wrote:

            > I think you got it right.  And I think we're pretty good in terms of
            > BTL usage of ORTE and OPAL (to include the new "notifier" service
            > that Ralph put in recently -- what the FTB will likely eventually
            > use, I think...?); those interfaces and abstraction barriers are
            > technologically enforced.  If you break the abstractions, the linker
            > will swiftly and unmercifully punish you.  (this was exactly [one
            > of] the rationale that we used for splitting the code base into
            > OPAL, ORTE, and OMPI several years ago)
            >
            > Greg has already noted on the wiki a few constants used in the BTL's
            > that have an OMPI_ prefix that aren't really OMPI values (e.g.,
            > OMPI_ENABLE_HETEROGENEOUS_SUPPORT).  These come from configure
            > (i.e., opal/include/opal_config.h) and were not renamed back when we
            > split the code base into OPAL, ORTE, and OMPI.  I don't think we had
            > a strong reason for not renaming them -- most could probably be
            > renamed to OPAL_* -- we just didn't do it then.  Perhaps they can be
            > changed during the BTL extraction process (I noted this on the wiki).
            >
            >
            >
            > On Dec 3, 2008, at 9:43 PM, Richard Graham
            wrote:
            >
            >> BTW,
            >>  I was guessing FTB is Fault Tolerant
            Backbone, but if not, can
            >> someone tell me what it is ?  If it is not the
            latter, what I just
            >> wrote about it makes no sense.
            >>
            >> Rich
            >>
            >>
            >> On 12/3/08 9:34 PM, "Richard Graham"
            <rlgra...@ornl.gov> wrote:
            >>
            >>> The goal is to use the btl's outside of the
            context of MPI, which
            >>> was what was in mind from the day the ompi
            work started over five
            >>> years ago, but with no other use at the time,
            things grew up
            >>> intermingled - no surprise at all.  What we are
            attempting to do
            >>> is to untangle the existing dependencies, and
            make a much cleaner
            >>> distinction between how/what data is passed
            between layers.
            >>>
            >>> I expect this will involve some sort of well
            defined interface
            >>> between the btl's and orte, and I don't know if
            this will also
            >>> require something like this between the btl's
            and the pml - I
            >>> think that interface is rigidly enforced, but
            am not sure.
            >>>
            >>> I expect that explicit calls to FTB in the
            btl layer would have to
            >>> be componentized, especially in the context
            of what is developing
            >>> in the FT working group of the MPI Forum.
             Not that FTB is bad in
            >>> any way, just that it is one of many
            monitors.
            >>>
            >>> We will need to talk about this on a case by
            case basis, and
            >>> decide how to proceed.  If anyone wants to
            help, please do.
            >>>
            >>> Rich
            >>>
            >>>
            >>> On 12/3/08 3:02 PM, "Ralph Castain"
            <r...@lanl.gov> wrote:
            >>>
            >>>> I managed to execute the modex-less changes
            pretty much without
            >>>> introducing additional ORTE dependencies
            into the BTL's, though
            >>>> there
            >>>> may be some additions as we look at the other
            BTLs that I didn't
            >>>> address. So hopefully that won't contribute
            too much to the issue
            >>>> here.
            >>>>
            >>>> At the moment, I don't think it matters
            where notifier sits - it
            >>>> might
            >>>> be able to move to OPAL. Only catch will be
            if some notifier
            >>>> component
            >>>> requires communications. I'm thinking of
            FTB, for example, and
            >>>> our own
            >>>> local monitoring program that may require
            TCP messaging. We don't
            >>>> currently have anything in OPAL that would
            support an OPAL level
            >>>> messaging system, though perhaps that could
            be resolved.
            >>>>
            >>>> We also have dependencies where the BTL's
            will call orte_ess to
            >>>> find
            >>>> out what node another proc is on, the node
            local rank of that proc,
            >>>> etc. Those dependencies are likely to grow
            after the Dec meeting
            >>>> (see
            >>>> wiki for that agenda item), and definitely
            cannot be moved to OPAL.
            >>>>
            >>>> However, note that Rich stated the BTL's
            were -not- moving to OPAL.
            >>>> This begs the question: where -are- they
            going? Into their own
            >>>> layer?
            >>>> Will that layer be somewhere in-between OMPI
            and ORTE (in which
            >>>> case,
            >>>> the ORTE dependencies are moot)?
            >>>>
            >>>> I note that the wiki page doesn't address
            any of these questions,
            >>>> which is understandable if things are just
            getting underway. But it
            >>>> does sound like this is going to take some
            thought to ensure we
            >>>> don't
            >>>> paint ourselves into a corner.
            >>>>
            >>>> Ralph
            >>>>
            >>>>
            >>>> On Dec 3, 2008, at 12:10 PM, Jeff Squyres
            wrote:
            >>>>
            >>>> > FWIW, I see lots of notifier calls being
            added to the BTLs (and
            >>>> > elsewhere throughout the OMPI code base)
            over time...
            >>>> >
            >>>> > On Dec 3, 2008, at 2:07 PM, Tim Mattox
            wrote:
            >>>> >
            >>>> >> The BTLs might have added calls to the
            notifier framework in
            >>>> their
            >>>> >> error paths.
            >>>> >> The notifier framework is currently in
            the ORTE layer... not
            >>>> sure
            >>>> >> if we could
            >>>> >> move it down to OPAL.  Ralph, any
            thoughts on that?
            >>>> >>
            >>>> >> On Wed, Dec 3, 2008 at 11:56 AM, Richard
            Graham <rlgra...@ornl.gov
            >>>> >
            >>>> >> wrote:
            >>>> >>> George told me about what he is doing,
            so no changes would be
            >>>> >>> committed
            >>>> >>> until George has his changes in.
            >>>> >>>
            >>>> >>> Are there other changes to the btl's
            that we should be aware
            >>>> of ?
            >>>> >>>
            >>>> >>> Rich
            >>>> >>>
            >>>> >>>
            >>>> >>> On 12/3/08 11:47 AM, "George Bosilca"
            <bosi...@eecs.utk.edu>
            >>>> wrote:
            >>>> >>>
            >>>> >>>> Terry,
            >>>> >>>>
            >>>> >>>> I'm involved [to some degree] in both
            efforts and I can
            >>>> confirm
            >>>> >>>> these
            >>>> >>>> two efforts will not affect each other
            in any bad way.
            >>>> >>>>
            >>>> >>>>  george.
            >>>> >>>>
            >>>> >>>> On Dec 3, 2008, at 11:42 , Terry Dontje
            wrote:
            >>>> >>>>
            >>>> >>>>> I don't have any *strong* objections.
            However, I know that
            >>>> Eugene
            >>>> >>>>> and George B have been working on some
            Fastpath code changes
            >>>> >>>>> that we
            >>>> >>>>> should make sure neither project
            obliterates the other.
            >>>> >>>>>
            >>>> >>>>> --td
            >>>> >>>>>
            >>>> >>>>> Richard Graham wrote:
            >>>> >>>>>> Now that 1.3 will be released, we
            would like to go ahead
            >>>> with the
            >>>> >>>>>> plan to move the btl's out of the MPI
            layer. Greg Koenig
            >>>> who is
            >>>> >>>>>> doing most of the work has started a
            wiki page with
            >>>> details on
            >>>> >>>>>> the
            >>>> >>>>>> plans. Right now details are sketchy,
            as Greg is digging
            >>>> through
            >>>> >>>>>> the code, and has only handwritten
            notes on data
            >>>> structures that
            >>>> >>>>>> need to be moved, include files that
            are not needed, etc.
            >>>> The
            >>>> >>>>>> page
            >>>> >>>>>> is at:
            >>>> >>>>>>
            _https://svn.open-mpi.org/trac/ompi/wiki/BTLExtraction_
            >>>> >>>>>>
            >>>> >>>>>> The first three steps basically only
            involve code motion,
            >>>> moving
            >>>> >>>>>> items such as ompi_list, and renaming
            them, moving where
            >>>> the code
            >>>> >>>>>> is actually located in the
            repository, and the like. For
            >>>> these we
            >>>> >>>>>> do not plan to put out a formal RFC,
            but comments are very
            >>>> >>>>>> welcome,
            >>>> >>>>>> and any hands that are willing to
            help with this are even
            >>>> more
            >>>> >>>>>> welcome.
            >>>> >>>>>>
            >>>> >>>>>> The last phase where the btl's are made
            dependent on OPAL,
            >>>> and
            >>>> >>>>>> supporting libraries such as mpools I
            expect will be
            >>>> disruptive,
            >>>> >>>>>> and will definitely require an RFC,
            and will also be a
            >>>> longer
            >>>> >>>>>> process.
            >>>> >>>>>>
            >>>> >>>>>> Please send comments,
            >>>> >>>>>> Rich
            >>>> >>>>>>
            >>>>
            
------------------------------------------------------------------------
            >>>> >>>>>>
            >>>> >>>>>>
            _______________________________________________
            >>>> >>>>>> devel mailing list
            >>>> >>>>>> de...@open-mpi.org
            >>>> >>>>>>
            http://www.open-mpi.org/mailman/listinfo.cgi/devel
            >>>> >>>>>>
            >>>> >> --
            >>>> >> Tim Mattox, Ph.D. -
            http://homepage.mac.com/tmattox/
            >>>> >> tmat...@gmail.com ||
            timat...@open-mpi.org
            >>>> >>   I'm a bright...
            http://www.the-brights.net/
            >>>> >>
            >>>> > --
            >>>> > Jeff Squyres
            >>>> > Cisco Systems
            >>>> >
            >
            > --
            > Jeff Squyres
            > Cisco Systems