On Tue, 18 Oct 2011, Thomas Sewell wrote:

My email about PolyML and memory usage has generated some discussion, perhaps I should give some more context.

The continuing improvements of Poly/ML statistics add important new aspects to the overall performance monitoring of Isabelle, although this has been an ongoing effort for many years already.


The L4.verified project has been stuck at Isabelle-2009-2 for years. Our proofs seem to be compatible with newer Isabelle versions, but we're running out of (virtual) memory. We've been running Poly/ML in 32-bit mode on most of our machines. As images grow beyond 4 GB, we must switch to 64-bit mode, which roughly doubles memory use to more than 8 GB, which means replacing most of our laptops. Soon 16 GB will be standard, I suppose.

Probably yes. Moore's Law for memory is still working. Five years ago 4 GB was really huge, but now it is not so much: something you can assume even for a "small" laptop. A normal technical workstation of 2011 could be expected to have 16-32 GB.
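The rough doubling Thomas mentions can be sketched as back-of-envelope arithmetic. The model below is a deliberate simplification (every heap cell taken as one header word plus two pointer-sized fields; Poly/ML's actual object layout differs), so treat it as an illustration rather than a measurement:

```python
# Back-of-envelope estimate of how a pointer-heavy ML heap scales with
# word size.  Assumption (NOT Poly/ML's real layout): every heap cell is
# one header word plus two pointer-sized fields, i.e. 3 words per cell.

def cell_bytes(word_bits):
    """Size in bytes of a toy boxed cell: header word + two pointer fields."""
    word = word_bits // 8
    return 3 * word

def heap_gb(n_cells, word_bits):
    """Estimated heap size in GB for n_cells such cells."""
    return n_cells * cell_bytes(word_bits) / 2**30

cells = 300_000_000  # hypothetical cell count for a multi-GB 32-bit heap
print(heap_gb(cells, 32))                       # about 3.35 GB in 32-bit mode
print(heap_gb(cells, 64))                       # about 6.71 GB in 64-bit mode
print(heap_gb(cells, 64) / heap_gb(cells, 32))  # factor 2.0
```

In practice the factor tends to stay somewhat below 2, because unboxed data (string contents, machine words) does not grow with the word size, but in typical ML heaps pointers dominate.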

Isabelle has traditionally targeted well-equipped workstations. The past few years saw the unusual situation that the difference between laptops and servers had diminished, but the gap is opening again for hardware and marketing reasons (think of smartphones, iPads, light netbooks etc. as opposed to huge "cloud" facilities).

I don't think it is realistic to follow the path of such mobile devices for local execution of the proof development environment. It is imaginable to do it remotely, though, say via some "proof cloud service". We have essentially had this in the past already: in the classic times of the Bali project, for example, it was commonplace to run the actual proof process in the background on a big server.


In the meanwhile I've been having a look at Isabelle memory use. I've never really understood why Isabelle images consume gigabytes of memory. With full proofs off, the only real inputs should be the statements of hypotheses and the definitions of types and terms. Isabelle's output is many orders of magnitude bigger than these inputs.

You probably mean heap size 1 GB, not image size -- the latter is fully shared and usually quite compact. Taking current hardware, 1 GB is not so much. Empirically, a software system always grows to fill up the available space. The question is what can be done with that, i.e. whether it is put to proper use or just wasted. (I am also the one who periodically reminds people of Isabelle/HOL bloat, even though it builds faster on my machine than ever before in history: 2min 16s on 4 cores with maybe 4-8 GB.)


Understanding the full details of memory usage is hard, often infeasible. After years of Isabelle maintenance and optimization, I have myself developed a certain mental model that approximates this complex situation.


Counting mere types and terms of the core logic is not realistic. The system is much more than the logic, even though everything happens to be reduced to some core calculus. And even the logic itself has a lot of not immediately visible overhead compared to the rather compact source representation. As a rule of thumb, the explicit type information is 10 times bigger than the overall term structure. You have already rediscovered some aspects of that in your recent profiling 4 measurement.
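The overhead of explicit type information can be illustrated with a toy datatype of fully annotated terms. All names below are hypothetical (this is not Isabelle's actual Const/Free/App representation); the point is only that when every constant and variable carries its complete type, the annotations quickly outweigh the bare term skeleton:

```python
# Toy model of fully type-annotated lambda terms (hypothetical datatypes,
# not Isabelle's real ones).  Every leaf stores its complete type, so
# type-constructor nodes outnumber term nodes even for tiny terms.

class Ty:
    def __init__(self, name, args=()):
        self.name, self.args = name, tuple(args)
    def size(self):
        # number of type-constructor nodes in this type
        return 1 + sum(a.size() for a in self.args)

def fun(a, b):
    """The function type a -> b."""
    return Ty("fun", (a, b))

class Leaf:
    """Constant or variable, annotated with its full type."""
    def __init__(self, name, ty):
        self.name, self.ty = name, ty
    def term_nodes(self): return 1
    def type_nodes(self): return self.ty.size()

class App:
    """Application node: carries no annotation of its own in this model."""
    def __init__(self, f, x):
        self.f, self.x = f, x
    def term_nodes(self): return 1 + self.f.term_nodes() + self.x.term_nodes()
    def type_nodes(self): return self.f.type_nodes() + self.x.type_nodes()

# "f x y" with f : 'a -> 'a -> 'a: each occurrence repeats type structure.
a = Ty("'a")
f = Leaf("f", fun(a, fun(a, a)))
t = App(App(f, Leaf("x", a)), Leaf("y", a))
print(t.term_nodes(), t.type_nodes())  # 5 term nodes vs 7 type nodes
```

Already here the type nodes dominate, and the ratio grows with the size of the types involved and with repeated occurrences of polymorphic constants, which is how rules of thumb like the factor 10 above arise in practice.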


The initial thought was that the record package would be representative. The HOL image has grown because it has more in it, but the theorems generated in defining a large record stay roughly constant.
...
The record package uses multiple typedefs to simplify the types of some terms. In 2009-2 a typedef caused a theory to be checkpointed 8 times; in 2011-2 it is 30 times.

This gives us a large collection of hypotheses, all of which are tedious to investigate. Please appreciate how valuable empirical data from PolyML is. Up until now we've been poking at these issues in the dark.

One more performance figure is good, but many other things have been measured over the years, with an ever-improving tendency. So profiling 4 is just one more step.

For a couple of years now, my morning ritual has been to look first at the charts that are generated from the isatest runs, e.g. see http://isabelle.in.tum.de/devel/stats/at-poly.html

Every tiny unexplained change in performance usually leads to some investigation of changesets, often with a private mail exchange with the people involved. Thus we have managed to keep an ever growing behemoth manageable, and actually deliver some performance to end-users, who in turn produce larger and larger applications to fill up the available space and time.


Here is a list of things that can be easily contributed externally to continue the continuous efforts to manage the system, without postulating arbitrary quick changes to the incredibly complex Isabelle/Pure and main HOL/Tools:

  * Improve isatest to include additional measurements, say from profiling
    4, and the new (still experimental) Poly/ML Statistics facility.

  * Improve isatest to remove some assumptions about shared NFS home
    directories, to make it work reliably on vmbroy9 under Windows/Cygwin.
    Since that platform has more resource limitations, it will indirectly
    result in a leaner Isabelle system and library over time.

  * Improve Mira to take over the main statistics functions of old
    isatest.  So a better Mira could make isatest obsolete eventually.

  * Ensure that the big applications, which are now mostly in AFP, produce
    usable statistics.  Recently there was very little information coming
    from this source, as I have pointed out several times already.

  * Publish the L4.verified sources together with results from the
    checking process.  The closed nature of the application makes it
    much more subject to performance issues -- this then really means
    "poking in the dark".

  * Procure financial resources to have David Matthews improve the Poly/ML
    statistics facilities even further.  Not just for small mobile
    devices in 32bit mode, but also for bigger iron with large degree of
    parallelism. (This is technically rather delicate, and there are big
    companies making big money just with performance monitoring tools for
    certain platforms.)


        Makarius
_______________________________________________
isabelle-dev mailing list
isabelle-...@in.tum.de
https://mailmanbroy.informatik.tu-muenchen.de/mailman/listinfo/isabelle-dev
