WHAT: Converting the back-end of MPI_Op's to use components instead of hard-coded C functions.

WHY: To support specialized hardware (such as GPUs).

WHERE: Changes most of the MPI_Op code, adds a new ompi/mca/op framework.

WHEN: Work has started in an hg branch (http://www.open-mpi.org/hg/hgwebdir.cgi/jsquyres/cuda/ ).

TIMEOUT: Next Tuesday's teleconference, Jan 13 2008.

---------------------------------------

Note: I don't plan to finish the work by Jan 13; I just want to get a yea/nay from the community on the concept. Final review of the code before coming into the trunk can come later when I have more work to show / review.

Background: Today, the back-end MPI_Op functionality for (MPI_Op, MPI_Datatype) tuples is implemented as function pointers to a series of hard-coded C functions in the ompi/op/ directory.

*** NOTE: Since we already implement MPI_Op functionality via function pointer, this proposed extension is not expected to cause any performance difference in terms of OMPI's infrastructure.

Proposal: Extend the current implementation by creating a new framework ("op") that allows components to provide back-end MPI_Op functions instead of/in addition to the hard-coded C functions (we've talked about this idea before, but never done it).

The "op" framework will be similar to the MPI coll framework in that individual function pointers from multiple different modules can be mixed-n-matched. For example, if you want to write a new coll component that implements *only* a new MPI_BCAST algorithm, that coll component can be mixed-n-matched with other coll components at run time to get a full set of collective implementations on a communicator. A similar concept will be applied to the "op" framework. Case in point: some specialized hardware is only good at *some* operations on *some* datatypes; we'll need to fall back to the hard-coded C versions for all other tuples.

It is likely that the "op" framework base will have all the hard-coded C "basic" MPI_Op functions, which will always be available as a fallback if no component provides a specialized implementation at run time. Specifically: the intent is that components will be for specialized implementations.
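
To make the mix-n-match and fallback idea concrete, here is a rough sketch in C. It is not the actual ompi/mca/op interface (that code is still being written); every name in it (op_table_t, op_select, accel_sum_double, the enum values) is made up for illustration, and the "accelerated" kernel is just a CPU stand-in for whatever specialized hardware would provide. The base table is complete, the component table is sparse, and selection picks the component's function pointer per (op, datatype) tuple when there is one and the base function otherwise.

```c
#include <assert.h>
#include <stddef.h>

/* All names below are hypothetical, not the real ompi/mca/op interface. */
enum op_id   { OP_SUM, OP_MAX, OP_COUNT };
enum type_id { TYPE_INT, TYPE_DOUBLE, TYPE_COUNT };

/* Reduction kernel: inout[i] = in[i] (op) inout[i] for i in [0, count). */
typedef void (*op_fn_t)(const void *in, void *inout, int count);

/* One function pointer per (op, datatype) tuple. */
typedef struct {
    op_fn_t fn[OP_COUNT][TYPE_COUNT];
} op_table_t;

/* Generates one element-wise kernel for a given C type and expression. */
#define DEFINE_KERNEL(name, type, expr)                          \
    static void name(const void *in, void *inout, int count) {  \
        const type *a = (const type *)in;                        \
        type *b = (type *)inout;                                 \
        for (int i = 0; i < count; ++i) { b[i] = (expr); }       \
    }

/* "Basic" hard-coded C kernels that would live in the framework base. */
DEFINE_KERNEL(base_sum_int,    int,    a[i] + b[i])
DEFINE_KERNEL(base_max_int,    int,    a[i] > b[i] ? a[i] : b[i])
DEFINE_KERNEL(base_sum_double, double, a[i] + b[i])
DEFINE_KERNEL(base_max_double, double, a[i] > b[i] ? a[i] : b[i])

/* Stand-in for a specialized kernel; here it simply runs on the CPU. */
DEFINE_KERNEL(accel_sum_double, double, a[i] + b[i])

/* Base table: every tuple is filled in, so fallback always succeeds. */
static const op_table_t base_table = {
    .fn = {
        [OP_SUM] = { [TYPE_INT] = base_sum_int, [TYPE_DOUBLE] = base_sum_double },
        [OP_MAX] = { [TYPE_INT] = base_max_int, [TYPE_DOUBLE] = base_max_double },
    }
};

/* Component table: sparse; NULL means "this component does not do that". */
static const op_table_t accel_table = {
    .fn = { [OP_SUM] = { [TYPE_DOUBLE] = accel_sum_double } }
};

/* Per tuple: take the component's function if present, else the base one.
 * Passing component == NULL selects the base functions everywhere. */
static void op_select(const op_table_t *base, const op_table_t *component,
                      op_table_t *out)
{
    assert(base != NULL && out != NULL);
    for (int op = 0; op < OP_COUNT; ++op) {
        for (int t = 0; t < TYPE_COUNT; ++t) {
            op_fn_t fn = component ? component->fn[op][t] : NULL;
            out->fn[op][t] = fn ? fn : base->fn[op][t];
        }
    }
}
```

Calling op_select(&base_table, &accel_table, &selected) leaves selected.fn[OP_SUM][TYPE_DOUBLE] pointing at the component's function and every other entry pointing at the base C version. The reduction itself is still one indirect call through a function pointer, the same as today.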

--
Jeff Squyres
Cisco Systems
