This looks very nice! Looking forward to integrate the AFP.
Cheers, Gerwin On 30/05/2011, at 9:54 AM, Alexander Krauss wrote: > Hi all, > > In the past weeks, there has been some progress with our new testing > infrastructure, which I would like to summarize here. Please give > feedback, ask questions, and discuss. > > > Really fast status for "isabelle makeall" > ========================================= > > Thanks to a new fat server with 24 cores and 130G of RAM, we can now > run "isabelle makeall all" in 17-18 minutes. This is done on all > changesets as they come in, so we always know quickly when the tip of > the repository breaks. > > The reports with logs etc. can all be browsed at > > http://isabelle.in.tum.de/reports/Isabelle/ > > This is a fully functional repository view, annotated with test > results. > > The first column of flashlights represents the status of > Isabelle_makeall. > > > Developer-initiated tests > ========================= > > Apart from changesets from the official repository, the system also > considers changesets from a special "testboard" repository. All > developers can push their untested changes here, to get some automatic > feedback. Try it out, it is really convenient. > > To use testboard, it is convenient to add the following to your > ~/.hgrc > > [paths] > testboard = > ssh://usern...@macbroy20.informatik.tu-muenchen.de//home/isabelle-repository/repos/testboard > > Then, local changes can be pushed to testboard via > > hg push -f testboard > > and will usually be considered fairly quickly by the scheduler. > (Note: -f is required, since the testboard repository will usually > have multiple heads. Never use -f when pushing to the main development > repository!) > > The reports from testboard can be viewed at > > http://isabelle.in.tum.de/testboard/Isabelle > > Note that testboard should be considered write-only. You should never > pull from it unless you want to get everybody's unfinished stuff! > > > SML/NJ testing > ============== > > Building the HOL image takes ca. 3-4 hours, and a full makeall takes > ca. 24 hours. This has been prohibitive for pre-push testing with > SML/NJ, so people generally have fixed issues after getting a failure. > The new setup supports the same style of working but gives quicker > feedback. Both the HOL image and makeall are run continuously, so > there is much quicker feedback. > > The third column in the reports view signals the status of SML/NJ > builds. Note that the SML_makeall test still excludes a small number > of sessions which are too big to fit into memory. > > > AFP > === > > Currently, there is a reduced build of the AFP, which is run > continuously, but it still excludes Flyspeck_Tame and > JinjaThreads. When Gerwin is here, we'll try to make this build > equivalent to the official AFP builds. > > The results are signalled in the second column in the reports view. > > > Data collection > =============== > > Performance data is collected in every run. When Makarius is available > again, we must decide on a set of sessions to be run regularly in a > reproducible environment (i.e., without other jobs or interactive > activity on the same machine), to get less fluctuation in the > measurements. > > Judgement Day benchmarks are now run continuously on two lxlabbroy > machines. This is still somewhat preliminary, and I must coordinate > with Jasmin and Sascha what data should best be collected and how. > > Mutabelle benchmarks are run continuously on one lxlabbroy machine in > pretty much the same way. > > The most important weakness at the moment is the lack of good plotting > facilities for this sort of data. A student (Michael Kerscher) is > currently working on that, with promising first results. > > > Limitations > =========== > > - Missing email notification > > Here is some room for discussion: What would be a good notification > scheme? Since tests are now run continuously, sending an email after > each run is not really an option. Should we simply go for a daily > summary? More sophisticated options are possible (e.g., notify the > author of a broken changeset when all its parents are working), but > these require some thought. > > - Scheduling strategies could be better sometimes > - Plotting (see above) > - ... > > > > Alex > _______________________________________________ > isabelle-dev mailing list > isabelle-...@in.tum.de > https://mailmanbroy.informatik.tu-muenchen.de/mailman/listinfo/isabelle-dev > _______________________________________________ isabelle-dev mailing list isabelle-...@in.tum.de https://mailmanbroy.informatik.tu-muenchen.de/mailman/listinfo/isabelle-dev