Dear Jeff Squyres,
On 12/03/2013 11:27 PM, Jeff Squyres (jsquyres) wrote:
I'm sorry; I really wasn't paying attention to my email the week of SC, and
then I was on vacation for the Thanksgiving holiday. :-\
More below.
On Nov 20, 2013, at 4:13 PM, Compres <compr...@in.tum.de> wrote:
I was at the Birds of a Feather session and wanted to talk to the Open MPI developers,
but unfortunately had to leave early. In particular, I would like to discuss
your implementation of the MPI tools interface and possibly contribute to
it later on.
Sorry we missed you.
No problem; I had to be at a booth during times that overlapped with
your session.
What did you want to discuss? We actually have a full implementation of the
MPI_T interface -- meaning that we have all the infrastructure in place for
MPI_T control and performance variables.
1. The MPI_T control variables map directly to OMPI's MCA params, so we
automatically expose oodles of cvars through MPI_T. They're all read-only
after MPI_INIT, however -- many things are set up during MPI_INIT and it would
be quite a Big Deal if they were to change. However, we pretty much *assumed*
all cvars shouldn't change after INIT -- we didn't really audit to see if there
were actually some cvars that could change after INIT. So there's work that
could be done there (i.e., find cvars that could change after INIT, and/or
evaluate how much work it would take to make some read-only cvars read-write,
etc.). A small sketch of reading one of these cvars through MPI_T follows below.
2. The MPI_T performance variables are new. There are only a few created right
now (e.g., in the Cisco usnic BTL). But the field is pretty wide open here --
the infrastructure is there, but we're really not exposing much information
yet. There's lots that can be done here (a discovery sketch follows below as well).
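For reference, reading one of these cvars through the standard MPI 3.0 tools-interface
calls looks roughly like the sketch below. The cvar name (btl_tcp_eager_limit) is only
an illustrative Open MPI MCA parameter, and it is assumed here to be an int-valued,
unbound cvar:

#include <stdio.h>
#include <string.h>
#include <mpi.h>

int main(int argc, char **argv)
{
    int provided, num_cvars, i;

    MPI_T_init_thread(MPI_THREAD_SINGLE, &provided);
    MPI_Init(&argc, &argv);

    MPI_T_cvar_get_num(&num_cvars);
    for (i = 0; i < num_cvars; i++) {
        char name[256], desc[1024];
        int name_len = sizeof(name), desc_len = sizeof(desc);
        int verbosity, bind, scope;
        MPI_Datatype dtype;
        MPI_T_enum enumtype;

        MPI_T_cvar_get_info(i, name, &name_len, &verbosity, &dtype,
                            &enumtype, desc, &desc_len, &bind, &scope);

        /* Illustrative name only; any cvar reported by the loop works. */
        if (0 == strcmp(name, "btl_tcp_eager_limit")) {
            MPI_T_cvar_handle handle;
            int count, value;

            /* Assumed not bound to any MPI object, so obj_handle is NULL. */
            MPI_T_cvar_handle_alloc(i, NULL, &handle, &count);
            MPI_T_cvar_read(handle, &value);
            printf("%s = %d  (%s)\n", name, value, desc);
            MPI_T_cvar_handle_free(&handle);
        }
    }

    MPI_Finalize();
    MPI_T_finalize();
    return 0;
}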
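The discovery loop for performance variables is analogous; which pvars actually show
up (e.g., the usnic BTL ones mentioned above) is entirely implementation-specific, so
this sketch just prints whatever is exported:

#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv)
{
    int provided, num_pvars, i;

    MPI_T_init_thread(MPI_THREAD_SINGLE, &provided);
    MPI_Init(&argc, &argv);

    MPI_T_pvar_get_num(&num_pvars);
    for (i = 0; i < num_pvars; i++) {
        char name[256], desc[1024];
        int name_len = sizeof(name), desc_len = sizeof(desc);
        int verbosity, var_class, bind, readonly, continuous, atomic;
        MPI_Datatype dtype;
        MPI_T_enum enumtype;

        MPI_T_pvar_get_info(i, name, &name_len, &verbosity, &var_class,
                            &dtype, &enumtype, desc, &desc_len, &bind,
                            &readonly, &continuous, &atomic);
        printf("pvar %d (class %d): %s -- %s\n", i, var_class, name, desc);
        /* Reading a value then goes through MPI_T_pvar_session_create,
           MPI_T_pvar_handle_alloc, MPI_T_pvar_start and MPI_T_pvar_read. */
    }

    MPI_Finalize();
    MPI_T_finalize();
    return 0;
}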
What did you have in mind?
I think you made a good guess about what we would like to do here. We are
working on automatic tuning based on both modeling and empirical data.
One of our aims is to accelerate the data-collection part (in this case,
for MPI settings) by doing it online, without the need for full
application runs or restarts.
Right now we can modify MPI runtime parameters with IBM MPI or Open
MPI, but these changes require full restarts, since the parameters are set as
environment variables and are not modifiable after MPI_INIT. With your MPI_T
implementation, we can do the same programmatically but still cannot avoid the
restarts or full runs.
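Concretely, with a cvar handle obtained as in the first sketch above, the write path
we would like to use looks roughly like this; the MPI_T_ERR_CVAR_SET_* error classes
are from the MPI 3.0 spec, and the OMPI_MCA_<name> environment variable plus a restart
is the fallback we use today:

#include <stdio.h>
#include <mpi.h>

/* Try to change an integer cvar at runtime; 'handle' is assumed to have been
   allocated for it as in the earlier sketch.  If the library reports the
   variable as not settable (currently the case for Open MPI cvars after
   MPI_INIT), fall back to the environment-variable-plus-restart path. */
static void try_write(const char *name, MPI_T_cvar_handle handle, int value)
{
    int err = MPI_T_cvar_write(handle, &value);

    if (MPI_SUCCESS == err) {
        printf("%s changed at runtime, no restart needed\n", name);
    } else if (MPI_T_ERR_CVAR_SET_NOT_NOW == err ||
               MPI_T_ERR_CVAR_SET_NEVER == err) {
        printf("%s is read-only here; set OMPI_MCA_%s=%d and restart\n",
               name, name, value);
    } else {
        printf("unexpected error %d while writing %s\n", err, name);
    }
}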
We already did what you describe at the end of 1., but with a (one-year-old)
snapshot of MPICH. The idea was to identify which variables could
be made modifiable at runtime, and whether any performance gains were
attainable by tuning them. We only explored point-to-point
and collective communication parameters, and the results are
encouraging. There was no technical reason for picking MPICH for the
first prototype.
With MPICH, we had to examine the code to find what was
configurable. It seems to me that in the case of Open MPI, most of that
work is already done and, as you point out, it may just be necessary to identify
which variables can be made modifiable at runtime and at what development cost.
My main intention here is to see whether other people are interested in this and
would benefit from it. Additionally, if the changes (patches) are accepted by
the project, we avoid drifting out of sync (which is what ended up
happening with our MPICH modifications).
- Isaías A. Comprés