Re: [OMPI devel] RFC: usnic BTL MPI_T pvar scheme

Paul Hargrove Tue, 5 Nov 2013 17:54:39 -0500 (EST)

Jeff,

If this approach is to be adopted by other components (and perhaps other
MPIs), then it would be important for the enumeration variable name to be
derived in a UNIFORM way:
    <framework>_<component>_SOMETHING
Without a fixed value for "SOMETHING" somebody will need to read sources
(or documentation) to make the connection.
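
To illustrate why a fixed suffix matters, here is a minimal sketch of how a
generic tool could compute the enumeration variable's name from any pvar name
of the same component. The helper derive_enum_name() is hypothetical (it is
not part of Open MPI or the MPI_T interface), and it assumes that neither the
framework name nor the component name contains an underscore. The suffix is
passed in, since its value is exactly what would need to be agreed upon.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not an Open MPI or MPI_T function.
 * Given a pvar name of the form <framework>_<component>_<rest>, build the
 * name of the companion enumeration variable:
 *     <framework>_<component>_<suffix>
 * Assumption: neither <framework> nor <component> contains '_'.
 * Returns 0 on success, -1 if the pvar name does not have two non-empty
 * leading fields or if the output buffer is too small. */
static int derive_enum_name(const char *pvar, const char *suffix,
                            char *out, size_t outlen)
{
    /* End of the <framework> field. */
    const char *first = strchr(pvar, '_');
    if (first == NULL || first == pvar) {
        return -1;
    }

    /* End of the <component> field. */
    const char *second = strchr(first + 1, '_');
    if (second == NULL || second == first + 1) {
        return -1;
    }

    /* Copy "<framework>_<component>_" (including the trailing '_'),
     * then append the agreed-upon suffix. */
    size_t prefix_len = (size_t)(second - pvar) + 1;
    int n = snprintf(out, outlen, "%.*s%s", (int)prefix_len, pvar, suffix);

    /* snprintf reports the length it wanted; treat truncation as failure. */
    return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
}
```

With a pvar name of "btl_usnic_num_sends" and a suffix of "SOMETHING", this
fills the buffer with "btl_usnic_SOMETHING". If each component picks its own
suffix, no helper of this kind can work without a per-component table.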


In the slides you used "btl_usnic_devices", which seems overly specific:
a single NIC might have multiple PORTS, making the "_devices" term
inappropriate/misleading (yes, it matches "device" in the sense of
/dev/foo, but not in the sense of a device as a physical object).  For tcp
on a multi-homed host, "device" is again not necessarily the first word that
comes to mind for identifying the "interface" or listening address.
Perhaps something nice and generic like "_instances" would work, which is at
least consistent with the definition of "module" given at
http://www.open-mpi.org/faq/?category=developers#ompi-terminology

-Paul


On Tue, Nov 5, 2013 at 2:37 PM, Jeff Squyres (jsquyres)
<jsquy...@cisco.com> wrote:

> WHAT: suggestion for how to expose multiple MPI_T pvar values for a given
> variable.
>
> WHY: so that we have a common convention across OMPI (and possibly set a
> precedent for other MPI implementations...?).
>
> WHERE: ompi/mca/btl/usnic, but if everyone likes it, potentially elsewhere
> in OMPI
>
> TIMEOUT: before 1.7.4, so let's set a first timeout of next Tuesday's
> teleconf (Nov 12)
>
> More detail:
> ------------
>
> Per my discussion on the call today, I'm sending the attached PPT of how
> we're exposing MPI_T performance variables in the usnic BTL in the
> multi-BTL case.
>
> Feedback is welcome, especially because we're the first MPI implementation
> to expose MPI_T pvars in this way (already committed on the trunk and
> targeted for 1.7.4).  So this methodology may well become a useful
> precedent.
>
> ** Issue #1: we want to expose each usnic BTL pvar (e.g.,
> btl_usnic_num_sends) on a per-usnic-BTL-*module* basis.  How to do this?
>
> 1. Add a prefix/suffix on each pvar name (e.g., btl_usnic_num_sends_0,
> btl_usnic_num_sends_1, ...etc.).
> 2. Return an array of values under the single name (btl_usnic_num_sends)
> -- one value for each BTL module.
>
> We opted for the 2nd option.  The MPI_T pvar interface provides a way to
> get the array length for a pvar, so this is all fine and good.
>
> Specifically: btl_usnic_num_sends returns an array of N values, where N is
> the number of usnic BTL modules being used by the MPI process.  Each slot
> in the array corresponds to the value from one usnic BTL module.
>
> ** Issue #2: but how do you map a given value to an underlying Linux usnic
> interface?
>
> Our solution was twofold:
>
> 1. Guarantee that the ordering of values in all pvar arrays is the same
> (i.e., usnic BTL module 0 will always be in slot 0, usnic BTL module 1 will
> always be in slot 1, ...etc.).
>
> 2. Add another pvar that is an MPI_T state variable with an associated
> MPI_T "enumeration", which contains string names of the underlying Linux
> devices.  This allows you to map a given value from a pvar to an underlying
> Linux device (e.g., from usnic BTL module 2 to /dev/usnic_3, or whatever).
>
> See the attached PPT.
>
> If people have no objection to this, we should use this convention across
> OMPI (e.g., for other BTLs that expose MPI_T pvars).
>
> --
> Jeff Squyres
> jsquy...@cisco.com
> For corporate legal information go to:
> http://www.cisco.com/web/about/doing_business/legal/cri/



-- 
Paul H. Hargrove                          phhargr...@lbl.gov
Future Technologies Group
Computer and Data Sciences Department     Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory     Fax: +1-510-486-6900
