Re: [OMPI devel] Specifying networks/APIs for OMPI (was: topic for agenda)

Paul Hargrove Tue, 20 Oct 2015 16:35:52 -0400 (EDT)

I looked quickly over the quoted emails and didn't see something I had
hoped/expected to.


In addition to the "dimensions" of type, api and pml I think users may also
be concerned about the "port" dimension (or device if you prefer).
So, it might be worth including that in the discussion of the
high-level-thing-for-end-users.

As an example, I might have two ethernet cards, one of which is a Cisco
VNIC.
I would want to be able to control which BTL or MTL is used on those NICs
independently, including the option to disable use of one or the other.
I do not want to learn distinct include/exclude MCA params for every BTL
and MTL to accomplish that.
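
To make the per-device idea concrete, here is a minimal Python sketch of what a
single, transport-independent policy could look like. Everything in it is
hypothetical: the device names ("eth0", "vnic0"), the `Device` type, and
`select_transports` are illustrations only and do not correspond to any existing
Open MPI option, MCA parameter, or API. It also assumes the VNIC could be driven
by either the usnic or the tcp BTL.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Device:
    name: str                    # hypothetical device name, e.g. "eth0"
    candidates: Tuple[str, ...]  # transports assumed able to drive this device


def select_transports(
    devices: List[Device], policy: Dict[str, Optional[str]]
) -> Dict[str, Optional[str]]:
    """Pick one transport per device; None means the device is disabled.

    A device that is absent from `policy` keeps automatic selection, which
    this sketch models as "take the first candidate".
    """
    chosen: Dict[str, Optional[str]] = {}
    for dev in devices:
        if dev.name not in policy:
            chosen[dev.name] = dev.candidates[0]
            continue
        want = policy[dev.name]
        if want is not None and want not in dev.candidates:
            raise ValueError(f"{want!r} cannot drive {dev.name}")
        chosen[dev.name] = want
    return chosen


# Two Ethernet cards, one of which is a Cisco VNIC (names are made up).
devices = [
    Device("eth0", ("tcp",)),
    Device("vnic0", ("usnic", "tcp")),
]

# Disable the plain NIC and pin the VNIC to usnic, in one place.
print(select_transports(devices, {"eth0": None, "vnic0": "usnic"}))
```

The point of the sketch is that the user states intent once, keyed by device,
instead of learning a separate include/exclude parameter for each BTL and MTL.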

-Paul

On Tue, Oct 20, 2015 at 12:42 PM, Jeff Squyres (jsquyres) <
jsquy...@cisco.com> wrote:

> We talked about this on the call last week.
>
> I'm guessing we'll talk about this at the Feb dev meeting, but we need to
> think about this a bit beforehand.  Here's a little more fuel for the
> fire: let's at least specify the problem space a bit more precisely...
>
> (this item is on the agenda for the Feb dev meeting, but we all need to
> think about this a little before then; it's a complicated set of issues)
>
> One (not-even-half-baked) idea that was raised on the call last week was
> the idea of 3 levels of specifying networks:
>
> 1. Automatic selection.  "mpirun a.out" -- OMPI does all the selection for
> the user.
> 2. High-level abstraction.  "mpirun <SOME NICE EASY-TO-UNDERSTAND CLI
> OPTIONS> a.out"
> 3. Low-level specification.  "mpirun --mca btl usnic,sm,self a.out"
>
> #1 and #3 already exist today: #1 is for most users, #3 is for OMPI
> experts.
>
> #2 is the new thing.  It's intended for those who have a clue about what
> they want, but they aren't necessarily OMPI or networking experts.  The
> trick is defining what <SOME NICE EASY-TO-UNDERSTAND CLI OPTIONS> is.
>
> The selection space is complicated -- it has (at least?) three dimensions:
>
> 1. First, we have network types:
>
> a. Ethernet
> b. InfiniBand
> c. uGNI
> d. InfiniPath
> e. OmniScale
> f. Shared memory
> g. SCIF
>
> 2. Second, we have network APIs:
>
> a. TCP
> b. usNIC (via libfabric)
> c. Verbs
> d. MXM
> e. uGNI
> f. PSM
> g. PSM2
> h. POSIX shared memory segments
> i. xpmem
> j. knem
> k. Linux CMA
> l. SCIF
>
> 3. Third, we have Open MPI networking layers:
>
> a. PML OB1 (and associated BTLs)
> b. PML CM (and associated MTL)
> c. PML BFO
> d. PML crcpw
> e. PML v
> f. PML Yalla
> g. PML UCX (soon)
>
> These three spaces can be combined in specific ways (E.g., Ethernet / TCP
> / PML OB1 + BTLs).
> BTLs have the added complication that multiple can be used in a single job.
> Some network types can be accessed through multiple combinations.
> Obviously, not all combinations are sensible (e.g., uGNI / PSM2 / PML
> Yalla).
>
> The Big Issues here are:
>
> - the user generally only knows about the first dimension: network type.
> - the OMPI networking layer names are generally not meaningful unless
> you're an OMPI expert.
>
> So how do we present a "simple" / "higher-level abstraction" for the
> average user?
>
>
>
> > On Oct 12, 2015, at 11:47 AM, Jeff Squyres (jsquyres)
> > <jsquy...@cisco.com> wrote:
> >
> > Rolf: can you add this to the agenda?
> >
> > We're now adding multiple ways to get to the same underlying network
> > transport, and it's getting confusing for users (I've fielded several
> > off-list questions from users about this issue).
> >
> > - MXM: can be accessed via Yalla, the MXM MTL, (soon) UCX, and (soon)
> >   libfabric
> > - PSM: can be accessed via the PSM MTL and libfabric
> > - verbs: can be accessed via the openib BTL and libfabric
> > - PSM2: ditto
> > - uGNI: can be accessed via the uGNI BTL, portals(4?), and (soon) UCX
> > - shared memory: can be accessed via sm, vader, and (soon) UCX
> >
> > But you can also look at this from a different perspective:
> >
> > - IB: can be used via Yalla, MXM MTL, UCX, libfabric (multiple ways)
> > - RoCE: can be used via ^^some (or all? I'm not sure) of these
> > - Cray: can be used via the uGNI BTL, portals(4?), and (soon) UCX
> >
> > ...what's a user supposed to use?
> >
> > And more specifically, how can a user enable or disable a specific type
> > of network?  Or API?
> >
> > A recent (off list) example I had was a user who was frustrated trying
> > to figure out how to disable all forms of MXM (note: this is a larger
> > issue than just MXM).
> >
> > Bottom line: underlying networks can be accessed through multiple
> > upper-layer APIs, and it creates both a mapping problem for the MPI
> > implementation, and a usability issue for users trying to be specific
> > about which network(s) they want the MPI implementation to use.
> >
> > I don't have a solution (or even a proposal) here.  This is something we
> > need to think / talk about.
> >
> > --
> > Jeff Squyres
> > jsquy...@cisco.com
> > For corporate legal information go to:
> > http://www.cisco.com/web/about/doing_business/legal/cri/
>
>
> --
> Jeff Squyres
> jsquy...@cisco.com
> For corporate legal information go to:
> http://www.cisco.com/web/about/doing_business/legal/cri/
>
> _______________________________________________
> devel mailing list
> de...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post:
> http://www.open-mpi.org/community/lists/devel/2015/10/18207.php
>
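
As a rough illustration of the mapping problem raised in the quoted Oct 12
message, the Python sketch below transcribes that message's "API -> upper
layers" list into a table and queries it in both directions. The table holds
only what the quoted list states; the function names are made up for this
sketch, the strings are labels rather than real MCA component names, and
"PSM2: ditto" is left out because the quote does not say unambiguously which
layers it refers to.

```python
from typing import Dict, List, Tuple

# (upper layer, available today?) for each underlying API, as listed in the
# quoted message. False marks entries the message tags "(soon)".
# The message writes "portals(4?)"; the version is left unspecified here.
ACCESS_PATHS: Dict[str, List[Tuple[str, bool]]] = {
    "MXM": [("Yalla", True), ("MXM MTL", True), ("UCX", False), ("libfabric", False)],
    "PSM": [("PSM MTL", True), ("libfabric", True)],
    "verbs": [("openib BTL", True), ("libfabric", True)],
    "uGNI": [("uGNI BTL", True), ("portals", True), ("UCX", False)],
    "shared memory": [("sm", True), ("vader", True), ("UCX", False)],
}


def layers_to_disable(api: str, include_planned: bool = True) -> List[str]:
    """Every upper layer a user would have to turn off to avoid `api`."""
    return [
        layer
        for layer, available in ACCESS_PATHS[api]
        if available or include_planned
    ]


def apis_reached_by(layer: str) -> List[str]:
    """Every underlying API that `layer` can reach (the reverse lookup)."""
    return sorted(
        api
        for api, paths in ACCESS_PATHS.items()
        if any(name == layer for name, _ in paths)
    )


print(layers_to_disable("MXM"))      # four separate places to switch off
print(apis_reached_by("libfabric"))  # one layer, several underlying APIs
```

Both lookups return more than one entry, which is the usability issue in a
nutshell: disabling "all forms of MXM" touches several layers, and disabling a
single layer such as libfabric affects several networks at once.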



-- 
Paul H. Hargrove                          phhargr...@lbl.gov
Computer Languages & Systems Software (CLaSS) Group
Computer Science Department               Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory     Fax: +1-510-486-6900
