I think this sounds reasonable, if (and only if) MPI_Accumulate is properly handled. The interface for calling the op functions was broken in some fairly obvious way for accumulate when I was writing the one-sided code. I think I had to call some supposedly internal bits of the interface to make accumulate work. I can't remember what they are now, but I do remember it being a problem.

Of course, unless it makes MPI_Allreduce on one double-sized floating point number using sum go faster, I'm not entirely sure a change is helpful ;).

Brian

On Mon, 5 Jan 2009, Jeff Squyres wrote:

WHAT: Converting the back-end of MPI_Op's to use components instead of hard-coded C functions.

WHY: To support specialized hardware (such as GPUs).

WHERE: Changes most of the MPI_Op code, adds a new ompi/mca/op framework.

WHEN: Work has started in an hg branch (http://www.open-mpi.org/hg/hgwebdir.cgi/jsquyres/cuda/).

TIMEOUT: Next Tuesday's teleconference, Jan 13 2009.

---------------------------------------

Note: I don't plan to finish the work by Jan 13; I just want to get a yea/nay from the community on the concept. Final review of the code before coming into the trunk can come later when I have more work to show / review.

Background: Today, the back-end MPI_Op functionality for (MPI_Op, MPI_Datatype) tuples is implemented as function pointers to a series of hard-coded C functions in the ompi/op/ directory.

*** NOTE: Since we already implement MPI_Op functionality via function pointers, this proposed extension is not expected to cause any performance difference in terms of OMPI's infrastructure.

Proposal: Extend the current implementation by creating a new framework ("op") that allows components to provide back-end MPI_Op functions instead of/in addition to the hard-coded C functions (we've talked about this idea before, but never done it).

The "op" framework will be similar to the MPI coll framework in that individual function pointers from multiple different modules can be mixed-n-matched. For example, if you want to write a new coll component that implements *only* a new MPI_BCAST algorithm, that coll component can be mixed-n-matched with other coll components at run time to get a full set of collective implementations on a communicator. A similar concept will be applied to the "op" framework. Case in point: some specialized hardware is only good at *some* operations on *some* datatypes; we'll need to fall back to the hard-coded C versions for all other tuples.

It is likely that the "op" framework base will contain all the hard-coded C "basic" MPI_Op functions, which will always be available as a fallback when no component is used at run-time for a specialized implementation. Specifically: the intent is that components will be reserved for specialized implementations.
