Re: [isabelle-dev] New Testing Infrastructure -- status report
This looks very nice! Looking forward to integrating the AFP.

Cheers,
Gerwin

On 30/05/2011, at 9:54 AM, Alexander Krauss wrote:

Hi all,

In the past weeks, there has been some progress with our new testing infrastructure, which I would like to summarize here. Please give feedback, ask questions, and discuss.

Really fast status for isabelle makeall
=======================================

Thanks to a new fat server with 24 cores and 130G of RAM, we can now run "isabelle makeall all" in 17-18 minutes. This is done on all changesets as they come in, so we always know quickly when the tip of the repository breaks.

The reports, with logs etc., can all be browsed at

  http://isabelle.in.tum.de/reports/Isabelle/

This is a fully functional repository view, annotated with test results. The first column of flashlights represents the status of Isabelle_makeall.

Developer-initiated tests
=========================

Apart from changesets from the official repository, the system also considers changesets from a special testboard repository. All developers can push their untested changes there to get automatic feedback. Try it out; it is really convenient.

To use testboard, add the following to your ~/.hgrc:

  [paths]
  testboard = ssh://usern...@macbroy20.informatik.tu-muenchen.de//home/isabelle-repository/repos/testboard

Local changes can then be pushed to testboard via

  hg push -f testboard

and will usually be picked up fairly quickly by the scheduler. (Note: -f is required, since the testboard repository will usually have multiple heads. Never use -f when pushing to the main development repository!)

The reports from testboard can be viewed at

  http://isabelle.in.tum.de/testboard/Isabelle

Note that testboard should be considered write-only: you should never pull from it unless you want to get everybody's unfinished stuff!

SML/NJ testing
==============

Building the HOL image takes ca. 3-4 hours, and a full makeall takes ca. 24 hours.
This has been prohibitive for pre-push testing with SML/NJ, so people have generally fixed issues only after a failure was reported. The new setup supports the same style of working, but both the HOL image and makeall are now built continuously, so feedback arrives much more quickly. The third column in the reports view signals the status of SML/NJ builds. Note that the SML_makeall test still excludes a small number of sessions which are too big to fit into memory.

AFP
===

Currently there is a reduced build of the AFP, run continuously, which still excludes Flyspeck_Tame and JinjaThreads. When Gerwin is here, we'll try to make this build equivalent to the official AFP builds. The results are signalled in the second column of the reports view.

Data collection
===============

Performance data is collected in every run. When Makarius is available again, we must decide on a set of sessions to be run regularly in a reproducible environment (i.e., without other jobs or interactive activity on the same machine), to reduce fluctuation in the measurements.

Judgement Day benchmarks are now run continuously on two lxlabbroy machines. This is still somewhat preliminary, and I must coordinate with Jasmin and Sascha on what data should best be collected and how. Mutabelle benchmarks are run continuously on one lxlabbroy machine in much the same way.

The most important weakness at the moment is the lack of good plotting facilities for this sort of data. A student (Michael Kerscher) is currently working on that, with promising first results.

Limitations
===========

- Missing email notification. Here is some room for discussion: what would be a good notification scheme? Since tests are now run continuously, sending an email after each run is not really an option. Should we simply go for a daily summary? More sophisticated options are possible (e.g., notify the author of a broken changeset when all its parents are working), but these require some thought.
- Scheduling strategies could sometimes be better.
- Plotting (see above).
- ...

Alex

_______________________________________________
isabelle-dev mailing list
isabelle-...@in.tum.de
https://mailmanbroy.informatik.tu-muenchen.de/mailman/listinfo/isabelle-dev
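Putting the testboard instructions from the report together, the one-time setup and the per-test push amount to the following sketch. This is a sketch, not part of the original announcement: the commit message is invented, the elided "usern..." placeholder must be replaced by your own account name, and the hg commands of course require a local Mercurial checkout of the Isabelle repository.

```shell
# One-time setup: record the testboard repository as a path alias
# in ~/.hgrc (URL exactly as given in the report; "usern..." is the
# elided username placeholder -- substitute your own account).
cat >> ~/.hgrc <<'EOF'
[paths]
testboard = ssh://usern...@macbroy20.informatik.tu-muenchen.de//home/isabelle-repository/repos/testboard
EOF

# Per test run: commit locally, then force-push to testboard.
# -f is needed because testboard routinely has multiple heads;
# never use -f when pushing to the main development repository.
hg commit -m "work in progress"
hg push -f testboard

# Results appear at http://isabelle.in.tum.de/testboard/Isabelle
```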
Re: [isabelle-dev] New Testing Infrastructure -- status report
Awesome! It looks positively industrial in scale.

Larry

On 30 May 2011, at 08:54, Alexander Krauss wrote:

> [Alexander's status report, quoted in full; identical to the message above.]
Alex

_______________________________________________
isabelle-dev mailing list
isabelle-...@in.tum.de
https://mailmanbroy.informatik.tu-muenchen.de/mailman/listinfo/isabelle-dev
Re: [isabelle-dev] New Testing Infrastructure -- status report
Hi Alex,

> In the past weeks, there has been some progress with our new testing
> infrastructure, which I would like to summarize here. Please give
> feedback, ask questions, and discuss.

Great!! Thanks for the update. I just have one question related to testboard:

> Local changes can then be pushed to testboard via
>
>   hg push -f testboard

I use queues a lot and usually do all testing before I qfinish the queued patches. Is there a Mercurial trick to push all the applied patches without qfinishing them first?

Jasmin

_______________________________________________
isabelle-dev mailing list
isabelle-...@in.tum.de
https://mailmanbroy.informatik.tu-muenchen.de/mailman/listinfo/isabelle-dev
Re: [isabelle-dev] New Testing Infrastructure -- status report
>> hg push -f testboard
>
> I use queues a lot and usually do all testing before I qfinish the
> queued patches. Is there a Mercurial trick to push all the applied
> patches without qfinishing them first?

hg push -f actually does push applied mq patches as normal changesets, so it should do exactly what you want.

Alex

_______________________________________________
isabelle-dev mailing list
isabelle-...@in.tum.de
https://mailmanbroy.informatik.tu-muenchen.de/mailman/listinfo/isabelle-dev
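To illustrate Alex's point: an applied mq patch already exists as an ordinary changeset in the repository, so a forced push includes it even before qfinish. A hypothetical session along these lines (patch name and commit message are invented for illustration; the testboard path alias from the report is assumed to be configured):

```shell
# Create and populate an mq patch (names are hypothetical).
hg qnew tune-simp.patch
# ... edit files ...
hg qrefresh -m "tune simp rules"

# The applied patch is a normal changeset, so a forced push
# sends it to testboard without qfinishing it first:
hg push -f testboard

# Once the testboard run succeeds, make the patch permanent
# and push to the main repository (no -f here!):
hg qfinish --applied
hg push
```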