Re: [isabelle-dev] [158c513a39f5] JVM crash
> You could try with a fresh build of Poly/ML: Okay, I rebuilt Poly/ML (not libsha1 and libgmp though; apparently your build script does not build those). The crashes still occured though. I think I did everything correctly, because when I delete the files that the compilation created, isabelle build fails immediately with some ML-related error message, so it must have been using them. > It should be sufficient to remove the is_pure() here: That does indeed seem to make the problem go away completely. At least I got about 50 consecutive builds without any crashes now. Manuel On 2017-11-08 15:35, Makarius wrote: > On 08/11/17 09:21, Manuel Eberl wrote: >> After a lengthy bisection, I found that the first revision where no >> crashes occur is this one: >> >> changeset: 66920:aefaaef29c58 >> user: wenzelm >> date: Thu Oct 26 13:44:41 2017 +0200 >> summary: use Poly/ML 5.7.1 test version as default; >> Does Poly/ML 5.7.1 contain any changes that could plausibly cause this >> bad behaviour to go away? > One side-condition that has also changed is the build platform: for the > various Poly/ML 5.6 Isabelle components it was still Ubuntu 10.04 LTS, > for current Poly/ML 5.7.1 test versions it is Ubuntu 12.04 LTS. > > Sometimes there are problems in the C/C++ compiler that disappear over > time. You could try with a fresh build of Poly/ML: the README in the > component contains brief instructions how to operate on the included src > directory. > > > Makarius ___ isabelle-dev mailing list isabelle-...@in.tum.de https://mailmanbroy.informatik.tu-muenchen.de/mailman/listinfo/isabelle-dev
[isabelle-dev] Formal dependency on "poly" executable
See now: changeset: 67033:09fb749d1a1e user:wenzelm date:Wed Nov 08 17:34:32 2017 +0100 files: src/Pure/Pure.thy description: formal dependency on "poly" executable; I've myself got into problems finding odd spaces to remove from Pure ML sources, in order to force a rebuild after changing the Poly/ML test version. The $POLYML_EXE is from Isabelle/8176914dae84. When testing older Poly/ML versions, POLYML_EXE="$ML_HOME/poly" is required in $ISABELLE_HOME_USER/etc/settings. Makarius ___ isabelle-dev mailing list isabelle-...@in.tum.de https://mailmanbroy.informatik.tu-muenchen.de/mailman/listinfo/isabelle-dev
Re: [isabelle-dev] [158c513a39f5] JVM crash
On 08/11/17 15:45, Manuel Eberl wrote: > Is there an easy way to disable that for testing purposes? Some line I > have to remove from a .scala file or something? It should be sufficient to remove the is_pure() here: http://isabelle.in.tum.de/repos/isabelle/annotate/3ff88fece1f6/src/Pure/Tools/build.scala#l522 Makarius ___ isabelle-dev mailing list isabelle-...@in.tum.de https://mailmanbroy.informatik.tu-muenchen.de/mailman/listinfo/isabelle-dev
Re: [isabelle-dev] [158c513a39f5] JVM crash
Is there an easy way to disable that for testing purposes? Some line I have to remove from a .scala file or something? Manuel On 2017-11-08 15:44, Makarius wrote: > On 08/11/17 15:39, Manuel Eberl wrote: >>> If these crashes are happening at the end of the build process I would >>> suspect that it is something to do with either the data sharing or >>> writing out the heap image. >> Does writing out of the heap happen also when I just do "isabelle build >> Pure" as opposed to "isabelle build -b Pure"? Because the "-b" has no >> influence on whether it crashes or not. > Yes, Pure always produces a heap, independently of the -b option. > > > Makarius ___ isabelle-dev mailing list isabelle-...@in.tum.de https://mailmanbroy.informatik.tu-muenchen.de/mailman/listinfo/isabelle-dev
Re: [isabelle-dev] [158c513a39f5] JVM crash
On 08/11/17 15:39, Manuel Eberl wrote: >> If these crashes are happening at the end of the build process I would >> suspect that it is something to do with either the data sharing or >> writing out the heap image. > > Does writing out of the heap happen also when I just do "isabelle build > Pure" as opposed to "isabelle build -b Pure"? Because the "-b" has no > influence on whether it crashes or not. Yes, Pure always produces a heap, independently of the -b option. Makarius ___ isabelle-dev mailing list isabelle-...@in.tum.de https://mailmanbroy.informatik.tu-muenchen.de/mailman/listinfo/isabelle-dev
Re: [isabelle-dev] Slow builds due to excessive heap images
If there are fixed-size integers in ML that raise an exception on overflow, it should be possible to just do appropriate code_printing setup similar to what Code_Target_Numeral does. We already have to somehow magically map Isabelle's "int" type to ML's "IntInf" type, so all we have to do is to locally map it to ML's fixed size "int" instead, I think. Manuel On 2017-11-08 15:36, Tobias Nipkow wrote: > > > On 08/11/2017 14:13, Makarius wrote: >> On 08/11/17 12:35, Tobias Nipkow wrote: >>> >>> On 07/11/2017 23:13, Makarius wrote: For Flyspeck-Tame a small performance loss remains. It might be worth trying to configure the Isabelle/HOL codegen to use FixedInt instead of regular (unbounded) Int. Thus it could become faster with Poly/ML 5.7.1 than with 5.6. >>> >>> Just as for Isabelle itself, I don't want generated code to abort with >>> overflow or even worse to overflow silently. >> >> I also don't want to see FixedInt used routinely instead of proper >> mathematical Int. >> >> The idea above is to provide an option for the HOL codegen that is used >> specifically for applications like Flyspeck-Tame. It is mainly a >> question to codegen experts, if that can be done easily. > > Then I misunderstood. An optional use of FixedInt for languages where > overflow raises an exception is fine with me. > >> If the answer is "no", I personally don't mind. Flyspeck-Tame runs >> de-facto only in background builds: 1-2h more or less does not matter so >> much. Its classic runtime was actually 10h total, now we are at 7h. > > In which case I would say that providing such an option should be > balanced with the complexity it requires or introduces. > > Tobias > >> >> Makarius >> > > > > ___ > isabelle-dev mailing list > isabelle-...@in.tum.de > https://mailmanbroy.informatik.tu-muenchen.de/mailman/listinfo/isabelle-dev ___ isabelle-dev mailing list isabelle-...@in.tum.de https://mailmanbroy.informatik.tu-muenchen.de/mailman/listinfo/isabelle-dev
Re: [isabelle-dev] [158c513a39f5] JVM crash
> If these crashes are happening at the end of the build process I would > suspect that it is something to do with either the data sharing or > writing out the heap image. Does writing out of the heap happen also when I just do "isabelle build Pure" as opposed to "isabelle build -b Pure"? Because the "-b" has no influence on whether it crashes or not. > It's impossible to be sure about any of this without running tests on > the hardware itself. I would gladly run any tests that you propose on my hardware. I could even give you SSH access to a live-CD-like system. Manuel ___ isabelle-dev mailing list isabelle-...@in.tum.de https://mailmanbroy.informatik.tu-muenchen.de/mailman/listinfo/isabelle-dev
Re: [isabelle-dev] Slow builds due to excessive heap images
On 08/11/2017 14:13, Makarius wrote: On 08/11/17 12:35, Tobias Nipkow wrote: On 07/11/2017 23:13, Makarius wrote: For Flyspeck-Tame a small performance loss remains. It might be worth trying to configure the Isabelle/HOL codegen to use FixedInt instead of regular (unbounded) Int. Thus it could become faster with Poly/ML 5.7.1 than with 5.6. Just as for Isabelle itself, I don't want generated code to abort with overflow or even worse to overflow silently. I also don't want to see FixedInt used routinely instead of proper mathematical Int. The idea above is to provide an option for the HOL codegen that is used specifically for applications like Flyspeck-Tame. It is mainly a question to codegen experts, if that can be done easily. Then I misunderstood. An optional use of FixedInt for languages where overflow raises an exception is fine with me. If the answer is "no", I personally don't mind. Flyspeck-Tame runs de-facto only in background builds: 1-2h more or less does not matter so much. Its classic runtime was actually 10h total, now we are at 7h. In which case I would say that providing such an option should be balanced with the complexity it requires or introduces. Tobias Makarius smime.p7s Description: S/MIME Cryptographic Signature ___ isabelle-dev mailing list isabelle-...@in.tum.de https://mailmanbroy.informatik.tu-muenchen.de/mailman/listinfo/isabelle-dev
Re: [isabelle-dev] [158c513a39f5] JVM crash
On 08/11/17 09:21, Manuel Eberl wrote: > > After a lengthy bisection, I found that the first revision where no > crashes occur is this one: > > changeset: 66920:aefaaef29c58 > user: wenzelm > date: Thu Oct 26 13:44:41 2017 +0200 > summary: use Poly/ML 5.7.1 test version as default; > Does Poly/ML 5.7.1 contain any changes that could plausibly cause this > bad behaviour to go away? One side-condition that has also changed is the build platform: for the various Poly/ML 5.6 Isabelle components it was still Ubuntu 10.04 LTS, for current Poly/ML 5.7.1 test versions it is Ubuntu 12.04 LTS. Sometimes there are problems in the C/C++ compiler that disappear over time. You could try with a fresh build of Poly/ML: the README in the component contains brief instructions how to operate on the included src directory. Makarius ___ isabelle-dev mailing list isabelle-...@in.tum.de https://mailmanbroy.informatik.tu-muenchen.de/mailman/listinfo/isabelle-dev
Re: [isabelle-dev] [158c513a39f5] JVM crash
If these crashes are happening at the end of the build process I would suspect that it is something to do with either the data sharing or writing out the heap image. Writing the heap image does not involve any concurrency but the data sharing does, so I would suspect the latter. That has changed recently in 5.7.1 but so too has saving the heap. The fact that it only occurs on specific hardware points to some error in the memory model i.e. that the code is assuming something about the consistency of memory that is not in fact true. It's impossible to be sure about any of this without running tests on the hardware itself. At least the problem seems to have gone away with the latest version of Poly/ML. David On 08/11/2017 08:21, Manuel Eberl wrote: I was not able to find out what exactly crashes and why, but while trying to do that, I found out that the problem is actually not present in the development version of Isabelle anymore. After a lengthy bisection, I found that the first revision where no crashes occur is this one: changeset: 66920:aefaaef29c58 user: wenzelm date: Thu Oct 26 13:44:41 2017 +0200 summary: use Poly/ML 5.7.1 test version as default; So apparently, something changed in Poly/ML 5.7.1 that made the crashes go away. Older versions of Isabelle, i.e. Isabelle2016-1, Isabelle2017, and isabelle-dev up to 66919:1f93e376aeb6 /do/ exhibit the crash. It thus seems likely that whatever caused it was apparently present in multiple older Poly/ML versions. Note that the crash appears to be highly sensitive to the environment: Having two builds run in parallel with different Isabelle versions seems to make it considerably less frequent; however, only in Poly/ML 5.7.1 does it appear to not happen at all no matter how I run it. Does Poly/ML 5.7.1 contain any changes that could plausibly cause this bad behaviour to go away? Manuel On 2017-11-07 09:51, Manuel Eberl wrote: It seems like this thread is not dead yet. I had my CPU replaced by a new version that supposedly does not have the SMD problem on Linux. The problem with Isabelle did not go away. I still get reproducible crashes of "isabelle build -c Pure", but only with SMT switched on. However, it is worth noting that the crashes always seem to happen at the end of the build process. (A successful build takes about 9 seconds of elapsed and CPU time on my machine and unsuccessful ones always crash at at least that time) However, it is worth noting that some of the failed builds look like this: 0:00:30 elapsed time, 0:00:08 cpu time, factor 0.29 Also, the last line that is printed is always ### theory "ML_Bootstrap" so perhaps that also points to some specific problem. However, I don't think it's anything specific to Pure, since the crash is also reproducible with other sessions if I remember correctly (I just use Pure because it can be built quickly). To summarise: - crash happens randomly in about 6% of builds of the "Pure" session - crash seems specific to my Ryzen or at least my system - disabling SMT makes problem go away - crash reproducible on my Arch installation and a fresh install of Arch, but not on Ubuntu - RAM passes memtest without errors - I did not experience instability with any other software - My conjecture would be that it is somehow related to PolyML and multithreading I attached both the console output and the build log. Manuel On 04/09/17 15:12, Manuel Eberl wrote: Alas, it would appear I have spoken too soon! I experienced a strange build failure with RC1 yesterday and, fearing the worst, did my experiment from a few weeks ago again, with the following result: – building "Pure" fails in around 6 % of the cases – this does not change even after a cold reboot – switching SMT off seems to make the problem go away entirely – switching SMT on makes it reappear Sounds very much like this might well be the Ryzen bug. AMD has started replacing affected CPUs, so I shall enquire about that and see what happens. Cheers, Manuel ___ isabelle-dev mailing list isabelle-...@in.tum.de https://mailmanbroy.informatik.tu-muenchen.de/mailman/listinfo/isabelle-dev
Re: [isabelle-dev] Slow builds due to excessive heap images
Hi Makarius, On 08/11/17 14:13, Makarius wrote: On 08/11/17 12:35, Tobias Nipkow wrote: On 07/11/2017 23:13, Makarius wrote: For Flyspeck-Tame a small performance loss remains. It might be worth trying to configure the Isabelle/HOL codegen to use FixedInt instead of regular (unbounded) Int. Thus it could become faster with Poly/ML 5.7.1 than with 5.6. Just as for Isabelle itself, I don't want generated code to abort with overflow or even worse to overflow silently. I also don't want to see FixedInt used routinely instead of proper mathematical Int. The idea above is to provide an option for the HOL codegen that is used specifically for applications like Flyspeck-Tame. It is mainly a question to codegen experts, if that can be done easily. My AFP entry Native_Word basically provides the setup for unsigned fixed-size integers, but as a separate type uintXX. The same could be done for signed fixed-size integers. The problem is how to get from int to uintXX. There are basically two options: 1. Just axiomatize that int are implemented with uintXX. This is potentially unsound. 2. Prove that actually no overflow occurs in the computations. The AFP entry Berlekamp-Zassenhaus does that for factoring polynomials over finite fields. The basic setup is there, but applying it to a particular instance requires quite some work. FixedInt has the additional challenge that the width is implementation dependent, so this requires a few tricks similar to the formalisation of uint in the AFP entry. In summary: In principle it could be possible to use FixedInt in Flyspec-Tame in a sound way, but it would be quite a bit of work. Andreas ___ isabelle-dev mailing list isabelle-...@in.tum.de https://mailmanbroy.informatik.tu-muenchen.de/mailman/listinfo/isabelle-dev
[isabelle-dev] Non-slow AFP sessions
Thanks to lots of performance tuning for Poly/ML 5.7.1, both HOL-ODE-Numerics and HOL-ODE-Examples have become non-slow (see AFP/13b569160947). Here is the reference timing on rather old lxbroy7, which is also a regular test machine for http://isabelle.in.tum.de/devel/build_status/AFP: Finished HOL-ODE-Numerics (1:07:18 elapsed time, 1:07:17 cpu time, factor 1.00) Finished HOL-ODE-Examples (1:05:43 elapsed time, 1:05:42 cpu time, factor 1.00) Approx. 1h CPU time is de-facto the upper bound for non-slow sessions. On more current multicore hardware these sessions are much faster, especially when run in isolation on many cores. E.g. see the last measurements as "slow": http://isabelle.in.tum.de/devel/build_status/AFP_slow_64bit_6_threads/index.html#session_HOL-ODE-Examples The idea general idea of non-slow AFP sessions: anything that can be run routinely in interactive tests, e.g. after HOL library maintenance. It should all finish in approx. 1h on high-end hardware (such as lxcisa0). In the past 12 years, there has been a continuous challenge to keep up with the natural growth of AFP applications. So far we have always managed, and still do quite well. I am also glad that the recent years of "flying blind" are over: we have tangible performance data for all of AFP again, even a bit better than in the past (individual ML statistics and theory timing). Makarius ___ isabelle-dev mailing list isabelle-...@in.tum.de https://mailmanbroy.informatik.tu-muenchen.de/mailman/listinfo/isabelle-dev
Re: [isabelle-dev] Slow builds due to excessive heap images
On 08/11/17 12:35, Tobias Nipkow wrote: > > On 07/11/2017 23:13, Makarius wrote: >> For Flyspeck-Tame a small performance loss remains. It might be worth >> trying to configure the Isabelle/HOL codegen to use FixedInt instead of >> regular (unbounded) Int. Thus it could become faster with Poly/ML 5.7.1 >> than with 5.6. > > Just as for Isabelle itself, I don't want generated code to abort with > overflow or even worse to overflow silently. I also don't want to see FixedInt used routinely instead of proper mathematical Int. The idea above is to provide an option for the HOL codegen that is used specifically for applications like Flyspeck-Tame. It is mainly a question to codegen experts, if that can be done easily. If the answer is "no", I personally don't mind. Flyspeck-Tame runs de-facto only in background builds: 1-2h more or less does not matter so much. Its classic runtime was actually 10h total, now we are at 7h. Makarius ___ isabelle-dev mailing list isabelle-...@in.tum.de https://mailmanbroy.informatik.tu-muenchen.de/mailman/listinfo/isabelle-dev
Re: [isabelle-dev] Slow builds due to excessive heap images
On 07/11/2017 23:13, Makarius wrote: For Flyspeck-Tame a small performance loss remains. It might be worth trying to configure the Isabelle/HOL codegen to use FixedInt instead of regular (unbounded) Int. Thus it could become faster with Poly/ML 5.7.1 than with 5.6. Just as for Isabelle itself, I don't want generated code to abort with overflow or even worse to overflow silently. Tobias smime.p7s Description: S/MIME Cryptographic Signature ___ isabelle-dev mailing list isabelle-...@in.tum.de https://mailmanbroy.informatik.tu-muenchen.de/mailman/listinfo/isabelle-dev
Re: [isabelle-dev] [158c513a39f5] JVM crash
I was not able to find out what exactly crashes and why, but while trying to do that, I found out that the problem is actually not present in the development version of Isabelle anymore. After a lengthy bisection, I found that the first revision where no crashes occur is this one: changeset: 66920:aefaaef29c58 user: wenzelm date: Thu Oct 26 13:44:41 2017 +0200 summary: use Poly/ML 5.7.1 test version as default; So apparently, something changed in Poly/ML 5.7.1 that made the crashes go away. Older versions of Isabelle, i.e. Isabelle2016-1, Isabelle2017, and isabelle-dev up to 66919:1f93e376aeb6 /do/ exhibit the crash. It thus seems likely that whatever caused it was apparently present in multiple older Poly/ML versions. Note that the crash appears to be highly sensitive to the environment: Having two builds run in parallel with different Isabelle versions seems to make it considerably less frequent; however, only in Poly/ML 5.7.1 does it appear to not happen at all no matter how I run it. Does Poly/ML 5.7.1 contain any changes that could plausibly cause this bad behaviour to go away? Manuel On 2017-11-07 09:51, Manuel Eberl wrote: > It seems like this thread is not dead yet. > > I had my CPU replaced by a new version that supposedly does not have the > SMD problem on Linux. The problem with Isabelle did not go away. > > I still get reproducible crashes of "isabelle build -c Pure", but only > with SMT switched on. However, it is worth noting that the crashes > always seem to happen at the end of the build process. (A successful > build takes about 9 seconds of elapsed and CPU time on my machine and > unsuccessful ones always crash at at least that time) However, it is > worth noting that some of the failed builds look like this: > > 0:00:30 elapsed time, 0:00:08 cpu time, factor 0.29 > > Also, the last line that is printed is always > > ### theory "ML_Bootstrap" > > so perhaps that also points to some specific problem. However, I don't > think it's anything specific to Pure, since the crash is also > reproducible with other sessions if I remember correctly (I just use > Pure because it can be built quickly). > > To summarise: > - crash happens randomly in about 6% of builds of the "Pure" session > - crash seems specific to my Ryzen or at least my system > - disabling SMT makes problem go away > - crash reproducible on my Arch installation and a fresh install of > Arch, but not on Ubuntu > - RAM passes memtest without errors > - I did not experience instability with any other software > - My conjecture would be that it is somehow related to PolyML and > multithreading > > I attached both the console output and the build log. > > Manuel > > > On 04/09/17 15:12, Manuel Eberl wrote: >> Alas, it would appear I have spoken too soon! >> >> I experienced a strange build failure with RC1 yesterday and, fearing >> the worst, did my experiment from a few weeks ago again, with the >> following result: >> >> – building "Pure" fails in around 6 % of the cases >> – this does not change even after a cold reboot >> – switching SMT off seems to make the problem go away entirely >> – switching SMT on makes it reappear >> >> Sounds very much like this might well be the Ryzen bug. AMD has started >> replacing affected CPUs, so I shall enquire about that and see what happens. >> >> Cheers, >> >> Manuel ___ isabelle-dev mailing list isabelle-...@in.tum.de https://mailmanbroy.informatik.tu-muenchen.de/mailman/listinfo/isabelle-dev