On Jan 11, 2008, at 1:29 PM, Ethan Mallove wrote:
On Fri, Jan/11/2008 12:49:50PM, Jeff Squyres wrote:
On Jan 10, 2008, at 10:29 AM, Josh Hursey wrote:
Since we are ramping up to a v1.3 release, we want the visualization to
support this effort. So we want to make sure that the visualization
will meet the development community's needs. We should probably ask
the devel-core list, but I thought I would start some of the
discussion here to make sure I am asking the right questions of the
group.
Sounds reasonable.
After a first go-round here, we might want to have a conversation with
the OMPI RMs to get their input -- that would still be a small group
from which to get targeted feedback on these questions.
This sounds good to me.
To start I have some basic questions:
- How does Open MPI determine that it is stable enough to release?
I personally have a Magic 8 Ball on my desk that I consult frequently
for questions like this. ;-)
Does it have an OMPI sticker on it? :)
It's a mix of many different metrics, actually:
- stuff unrelated to MTT results:
  - how many Trac tickets are open against that release, and whether we
    care about them
  - how urgent the bug fixes that are included are
  - external requirements (e.g., get an OMPI release out to meet the
    OFED release schedule)
  - ...and probably others
I realize that this is just to complete the list, but we may be able
to (one day in the distant future) link some of the Trac tickets with
MTT testing. This would allow us, for example, to have a link from a
Trac ticket to a special MTT reporter page that shows how well testing
for this bug is going, and who is testing it (or working on it). Just
something to kick around, but it might be neat if MTT and Trac could
play better together one day.
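Nothing like this exists today; as a purely hypothetical sketch, a small
script could turn a Trac ticket number into a reporter query, assuming we
tagged the relevant test runs with the ticket number. The reporter URL,
the query parameter name, and the "ticketNNNN" tagging convention below
are all invented for illustration, not real MTT reporter interfaces.

    # Purely hypothetical sketch: the URL, query parameter, and tagging
    # convention are invented for illustration, not real MTT interfaces.
    from urllib.parse import urlencode

    REPORTER_BASE = "https://mtt.example.org/reporter.php"  # placeholder URL

    def reporter_url_for_ticket(ticket_number):
        # Hypothetically filter reporter results down to runs tagged with
        # this Trac ticket number.
        params = {"text_test_suite": "ticket%d" % ticket_number}  # invented parameter
        return REPORTER_BASE + "?" + urlencode(params)

    print(reporter_url_for_ticket(1234))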
- related to MTT results:
  - "good" coverage on platforms (where "platform" = host arch, OS,
    OS version, compiler, compiler version, MCA params, interconnect,
    and scheduler -- note that some of these are orthogonal to each
    other...)
I think this is the one we are going to focus on in this first pass.
  - the only failures and timeouts we have are a) repeatable, b)
    consistent across multiple organizations (if relevant), and c)
    deemed to be acceptable
We might be able to help highlight this situation. I'll have to think
about it a bit more.
- What dimensions of testing are most/least important (e.g.,
platforms, compilers, feature sets, scale, ...)?
This is a hard question. :-\ I listed several dimensions above:
- host architecture
- OS
- OS version
- compiler
- compiler version
- MCA parameters used
- interconnect
- scheduler
Here are some more:
- number of processes tested
- layout of processes (by node, by proc, etc.)
I don't quite know how to order those in terms of priority. :-\
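To make the dimensions concrete, here is a minimal sketch of treating each
submitted result as a tuple over these dimensions so coverage can be counted
per combination. The field names and example values are mine, not MTT's
actual schema.

    # Minimal sketch (hypothetical field names, not MTT's real schema):
    # model each MTT result as a tuple over the dimensions listed above,
    # so we can ask "which combinations have we actually covered?"
    from collections import namedtuple

    Config = namedtuple(
        "Config",
        [
            "arch", "os", "os_version",
            "compiler", "compiler_version",
            "mca_params", "interconnect", "scheduler",
            "np", "layout",   # number of processes, process layout
        ],
    )

    def covered_combinations(results, dims=("arch", "compiler", "interconnect")):
        # Project results onto a subset of dimensions and return the
        # distinct combinations that were actually tested.
        return {tuple(getattr(r, d) for d in dims) for r in results}

    # Example with two fabricated results:
    r1 = Config("x86_64", "Linux", "RHEL4", "gcc", "4.1", "btl=tcp",
                "tcp", "slurm", 16, "bynode")
    r2 = Config("x86_64", "Linux", "RHEL4", "icc", "9.1", "btl=openib",
                "openib", "slurm", 32, "byslot")
    print(covered_combinations([r1, r2]))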
I think that for some of these characteristics it will be feature
dependent. We may end up with a few lists:
- General acceptance for all the normal cases and default feature sets
- A set of configurations that must pass for opt-in feature X
- A set of configurations that must pass for opt-in feature Y
- ...
Each list may have a different visualization associated with it. So we
can say that in the normal use case everything is fine, but when we
test with feature X then these N tests fail. Then we can determine
whether feature X is important enough to delay the release.
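As a rough sketch of what those per-feature lists might look like (the
feature names and configurations below are placeholders, not an agreed-upon
policy), each opt-in feature could carry its own set of configurations that
must pass before release:

    # Rough sketch (placeholder feature names and configurations):
    # each feature carries its own list of configurations that must pass.
    REQUIRED_PASSES = {
        "default": [
            {"arch": "x86_64", "interconnect": "tcp"},
            {"arch": "x86_64", "interconnect": "openib"},
        ],
        # Opt-in feature X only blocks the release for the configs listed here.
        "feature_x": [
            {"arch": "x86_64", "interconnect": "openib"},
        ],
    }

    def release_blockers(failed_configs, feature="default"):
        # Return the required configurations for `feature` that appear
        # in the list of failing configurations.
        required = REQUIRED_PASSES.get(feature, [])
        return [cfg for cfg in required if cfg in failed_configs]

    # Example: one required openib config failed under feature X.
    fails = [{"arch": "x86_64", "interconnect": "openib"}]
    print(release_blockers(fails, feature="feature_x"))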
- What other questions would be useful to answer with regard to
testing (thinking completely outside of the box)?
  * Example: Are we testing a specific platform/configuration set too
    much/too little?
This is a great question.
I would love to be able to configure this question -- e.g., are we
testing some MCA params too much/too little.
This is the one question we have been talking about the most. With the
visualization that Joseph was talking with me about, it seems like a
natural fit. It would help us to determine how to best organize our
testing efforts so we don't waste time over-testing something while
under-testing something else.
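Here is a minimal sketch of the kind of tally that question implies,
assuming we can export result rows labeled with a platform string and the
MCA parameters used; the thresholds are arbitrary examples, not proposed
policy.

    # Minimal sketch: count results per (platform, MCA params) cell and
    # flag cells that look over- or under-tested. Thresholds are arbitrary.
    from collections import Counter

    def coverage_report(rows, low=10, high=1000):
        # rows: iterable of (platform, mca_params) labels from exported results.
        counts = Counter(rows)
        under = {cell: n for cell, n in counts.items() if n < low}
        over = {cell: n for cell, n in counts.items() if n > high}
        return under, over

    # Fabricated example: one cell barely tested, one cell hammered.
    rows = ([("linux-x86_64/gcc", "btl=tcp")] * 5 +
            [("linux-x86_64/gcc", "btl=openib")] * 2000)
    under, over = coverage_report(rows)
    print("under-tested:", under)
    print("over-tested:", over)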
The performance stuff can always be visualized better, especially over
time. One idea is expressed in https://svn.open-mpi.org/trac/mtt/ticket/330.
I also very much like the ideas in https://svn.open-mpi.org/trac/mtt/ticket/236
and https://svn.open-mpi.org/trac/mtt/ticket/302 (302 is not expressed
as a visualization issue, but it could be -- you can imagine a
tree-based display showing the relationships between phase results,
perhaps even incorporated with a timeline -- that would be awesome).
These are good ideas.
Here's a wacky idea -- can our MTT data be combined with SCM data
(SVN, in this case) to answer questions like:
- what parts of the code are the most troublesome? i.e., when this
  part of the code changes, these tests tend to break
- what tests seem to be related to what parts of the OMPI code base?
- who / what SVN commit(s) seemed to cause specific tests to break?
(this seems like a longer-term set of questions, but I thought I'd
bring it up...)
I like this idea :-)
One level of indirection that is missing to do this is keying SVN r
numbers to the files they modified. We also need to be able to
somehow track *new* failures (see
https://svn.open-mpi.org/trac/mtt/ticket/70). E.g., "was it
*this* revision that broke test xyz, or was it an older one?"
-Ethan
This is a neat idea, and certainly possible. This may be easier than
one would expect. I know Joseph has a fair amount of experience mining
similar Sourceforge data to answer some related questions, so he may
have some ideas here.
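As a hedged sketch of the correlation being described here (assuming each
MTT result can be exported together with the SVN revision it was built
from; nothing below uses a real MTT or Trac interface), the interesting
output is the revision window between the last known pass and the first
failure for a given test:

    # Hedged sketch: given per-test results tagged with the SVN revision
    # they were built from, report the revision window in which a test
    # started failing. Assumes results can be exported as
    # (svn_revision, passed) pairs; no real MTT or Trac interface is used.
    def suspect_revision_window(history):
        # history: list of (svn_revision, passed) tuples, in any order.
        # Returns (last_passing_rev, first_failing_rev), or None if the
        # test never flipped from pass to fail.
        ordered = sorted(history)          # order by revision number
        last_pass = None
        for rev, passed in ordered:
            if passed:
                last_pass = rev
            elif last_pass is not None:
                return (last_pass, rev)    # broke somewhere in (last_pass, rev]
        return None

    # Fabricated example: the test passed through r17200 and first failed
    # at r17234, so the offending commit is somewhere in (17200, 17234].
    history = [(17150, True), (17200, True), (17234, False), (17240, False)]
    print(suspect_revision_window(history))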
I CC'ed Joseph on this email so he can see some of the questions being
posed. Joseph, feel free to subscribe to the mtt-devel list if you want
to. It is (I believe) just Ethan, Jeff, and myself, and is fairly low
traffic.
Keep the suggestions coming if you think of any more.
Cheers,
Josh
_______________________________________________
mtt-devel mailing list
mtt-de...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/mtt-devel