Re: [isabelle-dev] New Testing Infrastructure -- status report
This looks very nice! Looking forward to integrating the AFP.

Cheers,
Gerwin

On 30/05/2011, at 9:54 AM, Alexander Krauss wrote:

Hi all,

In the past weeks, there has been some progress with our new testing infrastructure, which I would like to summarize here. Please give feedback, ask questions, and discuss.

Really fast status for isabelle makeall
=======================================

Thanks to a new fat server with 24 cores and 130G of RAM, we can now run "isabelle makeall all" in 17-18 minutes. This is done on all changesets as they come in, so we always know quickly when the tip of the repository breaks.

The reports, with logs etc., can all be browsed at

  http://isabelle.in.tum.de/reports/Isabelle/

This is a fully functional repository view, annotated with test results. The first column of flashlights represents the status of Isabelle_makeall.

Developer-initiated tests
=========================

Apart from changesets from the official repository, the system also considers changesets from a special testboard repository. All developers can push their untested changes there to get automatic feedback. Try it out; it is really convenient.

To use testboard, add the following to your ~/.hgrc:

  [paths]
  testboard = ssh://usern...@macbroy20.informatik.tu-muenchen.de//home/isabelle-repository/repos/testboard

Local changes can then be pushed to testboard via

  hg push -f testboard

and will usually be picked up fairly quickly by the scheduler. (Note: -f is required, since the testboard repository will usually have multiple heads. Never use -f when pushing to the main development repository!)

The reports from testboard can be viewed at

  http://isabelle.in.tum.de/testboard/Isabelle

Note that testboard should be considered write-only: you should never pull from it unless you want to get everybody's unfinished stuff!

SML/NJ testing
==============

Building the HOL image takes ca. 3-4 hours, and a full makeall takes ca. 24 hours.
This has been prohibitive for pre-push testing with SML/NJ, so people have generally fixed issues only after a failure was reported. The new setup supports the same style of working, but both the HOL image and makeall are now built continuously, so feedback arrives much more quickly. The third column in the reports view signals the status of SML/NJ builds. Note that the SML_makeall test still excludes a small number of sessions which are too big to fit into memory.

AFP
===

Currently there is a reduced build of the AFP, run continuously, which still excludes Flyspeck_Tame and JinjaThreads. When Gerwin is here, we'll try to make this build equivalent to the official AFP builds. The results are signalled in the second column of the reports view.

Data collection
===============

Performance data is collected in every run. When Makarius is available again, we must decide on a set of sessions to be run regularly in a reproducible environment (i.e., without other jobs or interactive activity on the same machine), to reduce fluctuation in the measurements.

Judgement Day benchmarks are now run continuously on two lxlabbroy machines. This is still somewhat preliminary, and I must coordinate with Jasmin and Sascha on what data should best be collected and how. Mutabelle benchmarks are run continuously on one lxlabbroy machine in much the same way.

The most important weakness at the moment is the lack of good plotting facilities for this sort of data. A student (Michael Kerscher) is currently working on that, with promising first results.

Limitations
===========

- Missing email notification. Here is some room for discussion: what would be a good notification scheme? Since tests are now run continuously, sending an email after each run is not really an option. Should we simply go for a daily summary? More sophisticated options are possible (e.g., notify the author of a broken changeset when all its parents are working), but these require some thought.
- Scheduling strategies could sometimes be better.
- Plotting (see above).
- ...

Alex

_______________________________________________
isabelle-dev mailing list
isabelle-...@in.tum.de
https://mailmanbroy.informatik.tu-muenchen.de/mailman/listinfo/isabelle-dev
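Putting the testboard instructions from the report together, the one-time setup and the per-test push amount to the following sketch. This is a sketch, not part of the original announcement: the commit message is invented, the elided "usern..." placeholder must be replaced by your own account name, and the hg commands of course require a local Mercurial checkout of the Isabelle repository.

```shell
# One-time setup: record the testboard repository as a path alias
# in ~/.hgrc (URL exactly as given in the report; "usern..." is the
# elided username placeholder -- substitute your own account).
cat >> ~/.hgrc <<'EOF'
[paths]
testboard = ssh://usern...@macbroy20.informatik.tu-muenchen.de//home/isabelle-repository/repos/testboard
EOF

# Per test run: commit locally, then force-push to testboard.
# -f is needed because testboard routinely has multiple heads;
# never use -f when pushing to the main development repository.
hg commit -m "work in progress"
hg push -f testboard

# Results appear at http://isabelle.in.tum.de/testboard/Isabelle
```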
Re: [isabelle-dev] New Testing Infrastructure -- status report
Awesome! It looks positively industrial in scale.

Larry

On 30 May 2011, at 08:54, Alexander Krauss wrote:

> [Alexander's status report, quoted in full; identical to the message above.]
Alex

_______________________________________________
isabelle-dev mailing list
isabelle-...@in.tum.de
https://mailmanbroy.informatik.tu-muenchen.de/mailman/listinfo/isabelle-dev
Re: [isabelle-dev] New Testing Infrastructure -- status report
Hi Alex,

> In the past weeks, there has been some progress with our new testing
> infrastructure, which I would like to summarize here. Please give
> feedback, ask questions, and discuss.

Great!! Thanks for the update. I just have one question related to testboard:

> Local changes can then be pushed to testboard via
>
>   hg push -f testboard

I use queues a lot and usually do all testing before I qfinish the queued patches. Is there a Mercurial trick to push all the applied patches without qfinishing them first?

Jasmin

_______________________________________________
isabelle-dev mailing list
isabelle-...@in.tum.de
https://mailmanbroy.informatik.tu-muenchen.de/mailman/listinfo/isabelle-dev
Re: [isabelle-dev] New Testing Infrastructure -- status report
>> hg push -f testboard
>
> I use queues a lot and usually do all testing before I qfinish the
> queued patches. Is there a Mercurial trick to push all the applied
> patches without qfinishing them first?

hg push -f actually does push applied mq patches as normal changesets, so it should do exactly what you want.

Alex

_______________________________________________
isabelle-dev mailing list
isabelle-...@in.tum.de
https://mailmanbroy.informatik.tu-muenchen.de/mailman/listinfo/isabelle-dev
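To illustrate Alex's point: an applied mq patch already exists as an ordinary changeset in the repository, so a forced push includes it even before qfinish. A hypothetical session along these lines (patch name and commit message are invented for illustration; the testboard path alias from the report is assumed to be configured):

```shell
# Create and populate an mq patch (names are hypothetical).
hg qnew tune-simp.patch
# ... edit files ...
hg qrefresh -m "tune simp rules"

# The applied patch is a normal changeset, so a forced push
# sends it to testboard without qfinishing it first:
hg push -f testboard

# Once the testboard run succeeds, make the patch permanent
# and push to the main repository (no -f here!):
hg qfinish --applied
hg push
```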