Re: [MTT devel] MTT Visualization

2008-01-11 Thread Jeff Squyres

On Jan 10, 2008, at 10:29 AM, Josh Hursey wrote:


I met with Joseph Cottam (Grad student in my lab at IU) yesterday
about MTT visualization. He is working on some new visualization
techniques and wants to apply them to the MTT dataset.


Awesome.


Since we are ramping up to a v1.3 release, we want the visualization to
support this effort. So we want to make sure that the visualization
will meet the development community's needs. We should probably ask
the devel-core list, but I thought I would start some of the
discussion here to make sure I am asking the right questions of the
group.


Sounds reasonable.

After a first go-round here, we might want to have a conversation with
the OMPI RMs to get their input -- that would still be a small group
from which to get targeted feedback on these questions.



To start I have some basic questions:
 - How does Open MPI determine that it is stable enough to release?


I personally have a Magic 8 Ball on my desk that I consult frequently  
for questions like this.  ;-)


It's a mix of many different metrics, actually:

- stuff unrelated to MTT results:
   - how many trac tickets are open against that release, and do we care
   - how urgent are the bug fixes that are included
   - external requirements (e.g., get an OMPI release out to meet the
     OFED release schedule)
   - ...and probably others
- related to MTT results:
   - "good" coverage on platforms (where "platform" = host arch, OS,
     OS version, compiler, compiler version, MCA params, interconnect,
     and scheduler -- note that some of these are orthogonal to each
     other...)
   - the only failures and timeouts we have are a) repeatable, b)
     consistent across multiple organizations (if relevant), and c)
     deemed to be acceptable



 - What dimensions of testing are most/least important (e.g.,
platforms, compilers, feature sets, scale, ...)?


This is a hard question.  :-\  I listed several dimensions above:

- host architecture
- OS
- OS version
- compiler
- compiler version
- MCA parameters used
- interconnect
- scheduler

Here are some more:

- number of processes tested
- layout of processes (by node, by proc, etc.)

I don't quite know how to order those in terms of priority.  :-\
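
One way to look for coverage holes without ranking the dimensions is
to project the full cross-product down to two dimensions at a time.
A rough sketch with made-up run tuples (real data has many more
dimensions, per the lists above):

    # Hypothetical runs: (arch, os, compiler, interconnect) per MTT
    # submission.
    runs = [
        ("x86_64", "linux", "gcc", "openib"),
        ("x86_64", "linux", "gcc", "tcp"),
        ("x86_64", "linux", "icc", "openib"),
        ("ppc64",  "linux", "xlc", "tcp"),
    ]

    # The full cross-product explodes quickly, so look for holes in a
    # two-dimensional projection of the coverage grid.
    seen = {(arch, comp) for arch, _os, comp, _net in runs}
    for arch in sorted({r[0] for r in runs}):
        for comp in sorted({r[2] for r in runs}):
            mark = "x" if (arch, comp) in seen else "MISSING"
            print(f"{arch:8s} {comp:4s} {mark}")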


 - What other questions would be useful to answer with regard to
testing (thinking completely outside of the box)?
   * Example: Are we testing a specific platform/configuration set
too much/too little?


This is a great question.

I would love to be able to configure this question -- e.g., are we
testing some MCA params too much or too little?
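
A crude first cut at the too much/too little question: count runs per
configuration and flag the outliers. Made-up numbers and arbitrary
thresholds, just to show the shape of it:

    from collections import Counter

    # Hypothetical: one entry per test run, keyed by the MCA params used.
    runs = ["btl=openib"] * 40 + ["btl=tcp"] * 12 + ["btl=sm,self"] * 3

    counts = Counter(runs)
    mean = sum(counts.values()) / len(counts)
    for params, n in counts.most_common():
        if n > 1.5 * mean:
            note = "possibly over-tested"
        elif n < 0.5 * mean:
            note = "possibly under-tested"
        else:
            note = "about right"
        print(f"{params}: {n} runs -- {note}")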


The performance stuff can always be visualized better, especially over
time. One idea is expressed in
https://svn.open-mpi.org/trac/mtt/ticket/330.

I also very much like the ideas in
https://svn.open-mpi.org/trac/mtt/ticket/236 and
https://svn.open-mpi.org/trac/mtt/ticket/302 (302 is not expressed as
a visualization issue, but it could be -- you can imagine a tree-based
display showing the relationships between phase results, perhaps even
incorporated with a timeline -- that would be awesome).
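
For the over-time idea in #330, even something as simple as this would
do as a first cut (made-up latency numbers; matplotlib assumed); a
regression shows up as a step in the trend line:

    from datetime import date
    import matplotlib.pyplot as plt

    # Made-up numbers standing in for MTT performance results.
    dates = [date(2008, 1, d) for d in (2, 4, 6, 8, 10)]
    latency_usec = [3.1, 3.0, 3.2, 4.8, 4.7]

    plt.plot(dates, latency_usec, marker="o")
    plt.xlabel("date of MTT run")
    plt.ylabel("0-byte latency (usec)")
    plt.title("latency over time (hypothetical data)")
    plt.gcf().autofmt_xdate()
    plt.savefig("latency_trend.png")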


Here's a wacky idea -- can our MTT data be combined with SCM data
(SVN, in this case) to answer questions like:


- what parts of the code are the most troublesome?  i.e., when this  
part of the code changes, these tests tend to break


- what tests seem to be related to what parts of the OMPI code base?

- who / what SVN commit(s) seemed to cause specific tests to break?

(this seems like a longer-term set of questions, but I thought I'd  
bring it up...)



 - Other questions you think we should pose to the group?

We are currently feeling out the domain of possibilities, but hope to
start sketching some ideas in another week or so. This work should
proceed fairly quickly, since we are targeting a paper about this for
the ACM Symposium on Software Visualization (http://www.softvis.org/),
which is due in early April. How is that for expecting success? :)


Awesome.

--
Jeff Squyres
Cisco Systems



Re: [MTT devel] MTT Visualization

2008-01-11 Thread Ethan Mallove
On Fri, Jan/11/2008 12:49:50PM, Jeff Squyres wrote:
> [...]
> 
> Here's a wacky idea -- can our MTT data be combined with SCM data
> (SVN, in this case) to answer questions like:
> 
> - what parts of the code are the most troublesome?  i.e., when this
> part of the code changes, these tests tend to break
> 
> - what tests seem to be related to what parts of the OMPI code base?
> 
> - who / what SVN commit(s) seemed to cause specific tests to break?
> 
> (this seems like a longer-term set of questions, but I thought I'd
> bring it up...)


I like this idea :-) 

One level of indirection that's missing to do this is keying SVN
revision numbers to the files they modified. We also need to be able
to somehow track *new* failures (see
https://svn.open-mpi.org/trac/mtt/ticket/70). E.g., "was it *this*
revision that broke test xyz, or was it an older one?"
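
A toy sketch of what I mean (hypothetical pass/fail history): scan the
per-test history, ordered by revision, for the first pass -> fail
transition:

    # Hypothetical pass/fail history for one test, ordered by revision.
    history = [(17000, "pass"), (17001, "pass"), (17002, "fail"),
               (17003, "fail"), (17004, "fail")]

    def first_breaking_revision(history):
        """Return the revision at the first pass -> fail transition."""
        prev = None
        for rev, outcome in history:
            if prev == "pass" and outcome == "fail":
                return rev
            prev = outcome
        return None

    print(first_breaking_revision(history))  # -> 17002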

-Ethan

