On Jan 11, 2008, at 1:29 PM, Ethan Mallove wrote:
On Fri, Jan/11/2008 12:49:50PM, Jeff Squyres wrote:
On Jan 10, 2008, at 10:29 AM, Josh Hursey wrote:
Since we are ramping up to a v1.3 release, we want the visualization to
support this effort. So we want to make sure that the visualization
will meet the development community's needs. We should probably ask
the devel-core list, but I thought I would start some of the
discussion here to make sure I am asking the right questions of the
group.
Sounds reasonable.
After a first go-round here, we might want to have a conversation with
the OMPI RMs to get their input -- that would still be a small group
from which to get targeted feedback on these questions.
This sounds good to me.
To start I have some basic questions:
- How does Open MPI determine that it is stable enough to release?
I personally have a Magic 8 Ball on my desk that I consult frequently
for questions like this. ;-)
Does it have an OMPI sticker on it? :)
It's a mix of many different metrics, actually:
- stuff unrelated to MTT results:
  - how many Trac tickets are open against that release, and whether we
    care about them
  - how urgent the bug fixes that are included are
  - external requirements (e.g., get an OMPI release out to meet the
    OFED release schedule)
  - ...and probably others
I realize that this is just to complete the list, but we may be able
to (one day in the distant future) link some of the Trac tickets with
MTT testing. This would allow us, for example, to have a link from a
Trac ticket to a special MTT reporter page that shows how well testing
for this bug is going, and who is testing it (or working on it). Just
something to kick around, but it might be neat if MTT and Trac could
play better together one day.
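Nothing like this exists today; as a purely hypothetical sketch, a small
script could turn a Trac ticket number into a reporter query, assuming we
tagged the relevant test runs with the ticket number. The reporter URL,
the query parameter name, and the "ticketNNNN" tagging convention below
are all invented for illustration, not real MTT reporter interfaces.

    # Purely hypothetical sketch: the URL, query parameter, and tagging
    # convention are invented for illustration, not real MTT interfaces.
    from urllib.parse import urlencode

    REPORTER_BASE = "https://mtt.example.org/reporter.php"  # placeholder URL

    def reporter_url_for_ticket(ticket_number):
        # Hypothetically filter reporter results down to runs tagged with
        # this Trac ticket number.
        params = {"text_test_suite": "ticket%d" % ticket_number}  # invented parameter
        return REPORTER_BASE + "?" + urlencode(params)

    print(reporter_url_for_ticket(1234))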
- related to MTT results:
  - "good" coverage on platforms (where "platform" = host arch, OS,
    OS version, compiler, compiler version, MCA params, interconnect,
    and scheduler -- note that some of these are orthogonal to each
    other...)
I think this is the one we are going to focus on in this first pass.
  - the only failures and timeouts we have are a) repeatable, b)
    consistent across multiple organizations (if relevant), and c)
    deemed to be acceptable
We might be able to help highlight this situation. I'll have to think
about it a bit more.
- What dimensions of testing are most/least important (e.g.,
platforms, compilers, feature sets, scale, ...)?
This is a hard question. :-\ I listed several dimensions above:
- host architecture
- OS
- OS version
- compiler
- compiler version
- MCA parameters used
- interconnect
- scheduler
Here are some more:
- number of processes tested
- layout of processes (by node, by proc, etc.)
I don't quite know how to order those in terms of priority. :-\
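To make the dimensions concrete, here is a minimal sketch of treating each
submitted result as a tuple over these dimensions so coverage can be counted
per combination. The field names and example values are mine, not MTT's
actual schema.

    # Minimal sketch (hypothetical field names, not MTT's real schema):
    # model each MTT result as a tuple over the dimensions listed above,
    # so we can ask "which combinations have we actually covered?"
    from collections import namedtuple

    Config = namedtuple(
        "Config",
        [
            "arch", "os", "os_version",
            "compiler", "compiler_version",
            "mca_params", "interconnect", "scheduler",
            "np", "layout",   # number of processes, process layout
        ],
    )

    def covered_combinations(results, dims=("arch", "compiler", "interconnect")):
        # Project results onto a subset of dimensions and return the
        # distinct combinations that were actually tested.
        return {tuple(getattr(r, d) for d in dims) for r in results}

    # Example with two fabricated results:
    r1 = Config("x86_64", "Linux", "RHEL4", "gcc", "4.1", "btl=tcp",
                "tcp", "slurm", 16, "bynode")
    r2 = Config("x86_64", "Linux", "RHEL4", "icc", "9.1", "btl=openib",
                "openib", "slurm", 32, "byslot")
    print(covered_combinations([r1, r2]))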
I think that for some of these characteristics it will be feature
dependent. We may end up with a few lists:
- General acceptance for all the normal cases and default feature sets
- A set of configurations that must pass for opt-in feature X
- A set of configurations that must pass for opt-in feature Y
- ...
Each list may have a different visualization associated with it. So we
can say that in the normal use case everything is fine, but when we
test with feature X then these N tests fail. Then we can determine
whether feature X is important enough to delay the release.
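As a rough sketch of what those per-feature lists might look like (the
feature names and configurations below are placeholders, not an agreed-upon
policy), each opt-in feature could carry its own set of configurations that
must pass before release:

    # Rough sketch (placeholder feature names and configurations):
    # each feature carries its own list of configurations that must pass.
    REQUIRED_PASSES = {
        "default": [
            {"arch": "x86_64", "interconnect": "tcp"},
            {"arch": "x86_64", "interconnect": "openib"},
        ],
        # Opt-in feature X only blocks the release for the configs listed here.
        "feature_x": [
            {"arch": "x86_64", "interconnect": "openib"},
        ],
    }

    def release_blockers(failed_configs, feature="default"):
        # Return the required configurations for `feature` that appear
        # in the list of failing configurations.
        required = REQUIRED_PASSES.get(feature, [])
        return [cfg for cfg in required if cfg in failed_configs]

    # Example: one required openib config failed under feature X.
    fails = [{"arch": "x86_64", "interconnect": "openib"}]
    print(release_blockers(fails, feature="feature_x"))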
- What other questions would be useful to answer with regard to
testing (thinking completely outside of the box)?
  * Example: Are we testing a specific platform/configuration set too
    much/too little?
This is a great question.
I would love to be able to configure this question -- e.g., are we
testing some MCA params too much/too little.
This is the one question we have been talking about the most. With the
visualization that Joseph was talking with me about, it seems like a
natural fit. It would help us to determine how to best organize our
testing efforts so we don't waste time over-testing something while
under-testing something else.
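Here is a minimal sketch of the kind of tally that question implies,
assuming we can export result rows labeled with a platform string and the
MCA parameters used; the thresholds are arbitrary examples, not proposed
policy.

    # Minimal sketch: count results per (platform, MCA params) cell and
    # flag cells that look over- or under-tested. Thresholds are arbitrary.
    from collections import Counter

    def coverage_report(rows, low=10, high=1000):
        # rows: iterable of (platform, mca_params) labels from exported results.
        counts = Counter(rows)
        under = {cell: n for cell, n in counts.items() if n < low}
        over = {cell: n for cell, n in counts.items() if n > high}
        return under, over

    # Fabricated example: one cell barely tested, one cell hammered.
    rows = ([("linux-x86_64/gcc", "btl=tcp")] * 5 +
            [("linux-x86_64/gcc", "btl=openib")] * 2000)
    under, over = coverage_report(rows)
    print("under-tested:", under)
    print("over-tested:", over)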
The performance stuff can always be visualized better, especially over
time. One idea is expressed in https://svn.open-mpi.org/trac/mtt/ticket/330.
I also very much like the ideas in https://svn.open-mpi.org/trac/mtt/ticket/236
and https://svn.open-mpi.org/trac/mtt/ticket/302 (302 is not expressed
as a visualization issue, but it could be -- you can imagine a
tree-based display showing the relationships between phase results,
perhaps even incorporated with a timeline -- that would be awesome).
These are good ideas.
Here's a wacky idea -- can our MTT data be combined with SCM data
(SVN, in this case) to answer questions like:
- what parts of the code are the most troublesome? i.e., when this
  part of the code changes, these tests tend to break
- what tests seem to be related to what parts of the OMPI code base?
- who / what SVN commit(s) seemed to cause specific tests to break?
(this seems like a longer-term set of questions, but I thought I'd
bring it up...)
I like this idea :-)
One level of indirection that is missing to do this is keying SVN r
numbers to the files they modified. We also need to be able to
somehow track *new* failures (see
https://svn.open-mpi.org/trac/mtt/ticket/70). E.g., "was it
*this* revision that broke test xyz, or was it an older one?"
-Ethan
This is a neat idea, and certainly possible. This may be easier than
one would expect. I know Joseph has a fair amount of experience mining
similar Sourceforge data to answer some related questions, so he may
have some ideas here.
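As a hedged sketch of the correlation being described here (assuming each
MTT result can be exported together with the SVN revision it was built
from; nothing below uses a real MTT or Trac interface), the interesting
output is the revision window between the last known pass and the first
failure for a given test:

    # Hedged sketch: given per-test results tagged with the SVN revision
    # they were built from, report the revision window in which a test
    # started failing. Assumes results can be exported as
    # (svn_revision, passed) pairs; no real MTT or Trac interface is used.
    def suspect_revision_window(history):
        # history: list of (svn_revision, passed) tuples, in any order.
        # Returns (last_passing_rev, first_failing_rev), or None if the
        # test never flipped from pass to fail.
        ordered = sorted(history)          # order by revision number
        last_pass = None
        for rev, passed in ordered:
            if passed:
                last_pass = rev
            elif last_pass is not None:
                return (last_pass, rev)    # broke somewhere in (last_pass, rev]
        return None

    # Fabricated example: the test passed through r17200 and first failed
    # at r17234, so the offending commit is somewhere in (17200, 17234].
    history = [(17150, True), (17200, True), (17234, False), (17240, False)]
    print(suspect_revision_window(history))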
I CC'ed Joseph on this email so he can see some of the questions being
posed. Joseph, feel free to subscribe to the mtt-devel list if you want
to. It is (I believe) just Ethan, Jeff, and myself, and is fairly low
traffic.
Keep the suggestions coming if you think of any more.
Cheers,
Josh
_______________________________________________
mtt-devel mailing list
mtt-de...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/mtt-devel