Re: [isabelle-dev] Poly/ML x86_64_32 available for testing

2019-01-25 Thread Bertram Felgenhauer
Bertram Felgenhauer wrote:
> Makarius wrote:
> > So this is the right time for further testing of applications:
> > Isabelle2018 should work as well, but I have not done any testing beyond
> > "isabelle build -g main" -- Isabelle development only moves forward in
> > one direction on a single branch.
>
> I have tried this with Isabelle2018 and IsaFoR; I've encountered no
> problems and there's a nice speedup (estimated 1.25 times faster).
> Heap images are 40% smaller, which is a welcome change as well.

I have gathered some more data (below). There are some curiosities:

- Code export (that's the only thing that the 'Code' session does,
  besides theory merges) seems to be slower with polyml-a444f281ccec
  compared to polyml 5.7.1, even when sticking to the x86_64 platform.

- A similar slowdown seems to affect the IsaFoR_3 session, but I don't
  know what's special about that one.

- While most heap images are about 40% smaller with x86_64_32, this is
  not always the case; some heap images ended up being even larger in
  this experiment. Could there be a problem with the sharing pass on
  x86_64_32?

Cheers,

Bertram

Hardware:
  i7-6850K CPU @ 3.60GHz (6 cores x 2 hyperthreads) / 32GB RAM / SSD

Software:
  GNU/Linux
  Isabelle2018
  afp-2018 @ 400c722462b3
  IsaFoR @ acb9096ce08a

Common build flags:
  ISABELLE_BUILD_OPTIONS="threads=12 parallel_proofs=2"

Configurations:
  polyml 5.7.1x86_64ML_OPTIONS="--maxheap 24G -H 500 --gcthreads 6"
  polyml-a444f281ccec x86_64ML_OPTIONS="--maxheap 24G -H 500 --gcthreads 6"
  polyml-0a6ebca445fc x86_64_32 ML_OPTIONS="--maxheap 16G -H 500 --gcthreads 6"
  polyml-a444f281ccec x86_64_32 ML_OPTIONS="--maxheap 16G -H 500 --gcthreads 6"

Timings:
  Finished Pure (0:00:11 elapsed time, 0:00:14 cpu time, factor 1.20)
  Finished Pure (0:00:15 elapsed time, 0:00:17 cpu time, factor 1.18)
  Finished Pure (0:00:13 elapsed time, 0:00:15 cpu time, factor 1.14)
  Finished Pure (0:00:13 elapsed time, 0:00:15 cpu time, factor 1.13)

  Finished HOL (0:02:46 elapsed time, 0:10:50 cpu time, factor 3.91)
  Finished HOL (0:02:41 elapsed time, 0:10:05 cpu time, factor 3.75)
  Finished HOL (0:02:32 elapsed time, 0:09:46 cpu time, factor 3.84)
  Finished HOL (0:02:28 elapsed time, 0:09:20 cpu time, factor 3.78)

  Finished HOL-Lib (0:05:00 elapsed time, 0:35:54 cpu time, factor 7.18)
  Finished HOL-Lib (0:05:19 elapsed time, 0:36:41 cpu time, factor 6.88)
  Finished HOL-Lib (0:05:10 elapsed time, 0:35:32 cpu time, factor 6.88)
  Finished HOL-Lib (0:04:57 elapsed time, 0:34:16 cpu time, factor 6.91)

  Finished HOL-AFP (0:05:59 elapsed time, 0:37:39 cpu time, factor 6.28)
  Finished HOL-AFP (0:06:15 elapsed time, 0:38:31 cpu time, factor 6.15)
  Finished HOL-AFP (0:05:41 elapsed time, 0:35:44 cpu time, factor 6.28)
  Finished HOL-AFP (0:05:33 elapsed time, 0:35:01 cpu time, factor 6.30)

  Finished IsaFoR_1 (0:05:46 elapsed time, 0:31:43 cpu time, factor 5.49)
  Finished IsaFoR_1 (0:05:42 elapsed time, 0:31:06 cpu time, factor 5.46)
  Finished IsaFoR_1 (0:05:21 elapsed time, 0:29:58 cpu time, factor 5.59)
  Finished IsaFoR_1 (0:05:17 elapsed time, 0:29:15 cpu time, factor 5.54)

  Finished IsaFoR_2 (0:06:07 elapsed time, 0:29:37 cpu time, factor 4.84)
  Finished IsaFoR_2 (0:06:04 elapsed time, 0:29:15 cpu time, factor 4.82)
  Finished IsaFoR_2 (0:05:40 elapsed time, 0:27:39 cpu time, factor 4.87)
  Finished IsaFoR_2 (0:05:59 elapsed time, 0:28:14 cpu time, factor 4.72)

  Finished IsaFoR_3 (0:07:30 elapsed time, 0:47:56 cpu time, factor 6.38)
  Finished IsaFoR_3 (0:09:58 elapsed time, 0:55:27 cpu time, factor 5.56) ?
  Finished IsaFoR_3 (0:06:45 elapsed time, 0:43:32 cpu time, factor 6.45)
  Finished IsaFoR_3 (0:06:49 elapsed time, 0:43:20 cpu time, factor 6.35)

  Finished IsaFoR_4 (0:09:40 elapsed time, 0:30:25 cpu time, factor 3.15)
  Finished IsaFoR_4 (0:09:14 elapsed time, 0:29:16 cpu time, factor 3.17)
  Finished IsaFoR_4 (0:08:47 elapsed time, 0:27:52 cpu time, factor 3.17)
  Finished IsaFoR_4 (0:08:52 elapsed time, 0:27:57 cpu time, factor 3.15)

  Finished Code (0:05:30 elapsed time, 0:05:42 cpu time, factor 1.03)
  Finished Code (0:08:34 elapsed time, 0:08:46 cpu time, factor 1.02) ?
  Finished Code (0:07:59 elapsed time, 0:08:08 cpu time, factor 1.02)
  Finished Code (0:08:00 elapsed time, 0:08:08 cpu time, factor 1.02)

  total 00:48:29
  total 00:54:02
  total 00:48:08
  total 00:48:08

Heap image sizes (factor):
   28081037 Pure (1.00)
   25131549 Pure (0.89)
   19271077 Pure (0.68)
   19270685 Pure (0.68)

  410972244 HOL (1.00)
  403934084 HOL (0.98)
  249583659 HOL (0.61)
  249808871 HOL (0.61)

  525949867 HOL-Lib (1.00)
  526132427 HOL-Lib (1.00)
  308088218 HOL-Lib (0.59)
  307456194 HOL-Lib (0.58)

  913422879 HOL-AFP (1.00)
  915450415 HOL-AFP (1.00)
  537947934 HOL-AFP (0.59)
  538445922 HOL-AFP (0.59)

  724672759 IsaFoR_1 (1.00)
  724822231 IsaFoR_1 (1.00)
  428960922 IsaFoR_1 (0.59)
  429004970 IsaFoR_1 (0.59)

  623083120 IsaFoR_2 (1.00)
  

Re: [isabelle-dev] Poly/ML x86_64_32 available for testing

2019-01-25 Thread Fabian Immler

Great, thanks for looking into this!
Fabian

On 1/23/2019 4:24 PM, David Matthews wrote:

On 23/01/2019 21:08, Makarius wrote:

On 23/01/2019 12:05, David Matthews wrote:


I've had a look at this under Windows and the problem seems to be with
calls to IntInf.divMod from generated code, not from IntInf.pow.  The
underlying RTS call used by IntInf.quotRem has changed in the 32-in-64
version.  It previously returned the pair of values by passing in a
stack location and having the RTS update this.  That isn't possible in
the 32-in-64 version so now it allocates a pair on the heap.  For
simplicity this is now used for the native code versions as well.  From
what I can tell nearly all the threads are waiting for mutexes and I
suspect that the extra heap allocation in IntInf.quotRem is causing some
of the problem.  Waiting for a contended mutex can cost a significant
amount of processor time both in busy-waiting and in kernel calls.

I'm not sure what to suggest except not to use concurrency here.  There
doesn't seem to be a way to avoid allocating a pair in IntInf.quotRem.


In the meantime David has produced Poly/ML a444f281ccec and I can
confirm that it works very well:


I had another look and it was a mutex contention problem, just not 
exactly where I'd thought.  Most calls to the run-time system involved 
taking a lock for a very short period in order to get the thread's C++ 
data structure.  This wasn't a problem in the vast majority of cases but 
this example involves very many calls to the long-format arbitrary 
precision package.  That resulted in contention on the lock.  I've 
changed this so the lock is no longer needed and it seems to have solved 
the problem.


David




smime.p7s
Description: S/MIME Cryptographic Signature
___
isabelle-dev mailing list
isabelle-...@in.tum.de
https://mailmanbroy.informatik.tu-muenchen.de/mailman/listinfo/isabelle-dev


Re: [isabelle-dev] Poly/ML x86_64_32 available for testing (polyml-test-a444f281ccec)

2019-01-25 Thread Makarius
On 23/01/2019 23:44, Makarius wrote:
> Isabelle/20bc1d26c932 now provides an updated polyml-test-a444f281ccec
> (active by default).
> 
> It performs slightly better than the previous test version -- I have
> also removed old workarounds for integer arithmetic in
> Isabelle/4591221824f6.

One more performance result in this respect:

Isabelle/4591221824f6 + AFP/2eacf2ed7d5d

x86_64_32-linux --minheap 1500 threads=8

Finished Flyspeck-Tame (3:37:32 elapsed time, 5:32:38 cpu time, factor 1.53)

This is a about two times better than what we were used to. It is a
combination of high-end software (current Poly/ML) with high-end
hardware (Intel Xeon Gold 6148 CPU @ 2.40GHz, 20 CPU cores, 64 GB
memory, SSD).


Makarius
___
isabelle-dev mailing list
isabelle-...@in.tum.de
https://mailmanbroy.informatik.tu-muenchen.de/mailman/listinfo/isabelle-dev


Re: [isabelle-dev] Poly/ML x86_64_32 available for testing

2019-01-25 Thread Tobias Nipkow
This is really phantastic - at last I can build HOL-Analysis on my terrible new 
Mac without having to put it in the freezer and it does not fall over at the 
very end. It is also 20% faster.


Great work!
Tobias

On 19/01/2019 21:43, Makarius wrote:

Thanks to great work by David Matthews, there is now an Isabelle
component polyml-test-0a6ebca445fc (Isabelle/3b777286c3ec), which can be
enabled manually, e.g. in $ISABELLE_HOME_USER/etc/settings like this:

   init_component "$HOME/.isabelle/contrib/polyml-test-0a6ebca445fc"

It requires the usual "isabelle components -a" incantation afterwards.


This supports 3 platform variants:

   x86
   x86_64
   x86_64_32

x86 is still there for comparison, but will disappear soon.

x86_64 is the full 64-bit model as before, and subject to the system
option "ML_system_64 = true".

x86_64_32 is the default. It is a synthesis of the x86_64 architecture
(more memory, more registers etc.) with a 32-bit value model for ML.
Thus we get approx. 5 times more memory as in x86, without the penalty
of full 64-bit values.

Moreover, we have a proper "64-bit application" according to Apple (and
Linux distributions): it is getting increasingly hard to run old x86
executables, and soon it might be hardly possible at all. In other
words, Poly/ML is now getting ready for many more years to come.


The above component already works smoothly for all of Isabelle + AFP,
only with spurious drop-outs that can happen rarely. x86_64_32 is
already more stable than x86, which often suffers from out-of-memory
situations.


So this is the right time for further testing of applications:
Isabelle2018 should work as well, but I have not done any testing beyond
"isabelle build -g main" -- Isabelle development only moves forward in
one direction on a single branch.

For really big applications, it might occasionally help to use something
like "--maxheap 8g" in ML_OPTIONS: the implicit maximum is 16g, which is
sometimes too much for many parallel jobs on mid-range hardware. There
are additional memory requirements outside the ML heap, for program code
and stacks.


Makarius
___
isabelle-dev mailing list
isabelle-...@in.tum.de
https://mailmanbroy.informatik.tu-muenchen.de/mailman/listinfo/isabelle-dev





smime.p7s
Description: S/MIME Cryptographic Signature
___
isabelle-dev mailing list
isabelle-...@in.tum.de
https://mailmanbroy.informatik.tu-muenchen.de/mailman/listinfo/isabelle-dev