Re: [isabelle-dev] mira disk usage

Makarius Mon, 03 Sep 2012 04:58:03 -0700

On Sat, 1 Sep 2012, Lars Noschinski wrote:

On 01.09.2012 01:15, Gerwin Klein wrote:
I guess the problem is that build -b produces heap images for all sessions. These are many, and each image is hundreds of MB. The previous setup had images for selected base sessions only and no images for the rest (almost everything).
The "AFP" session in mira does only produce the base images, AFAICS:

$ ls AFP_c5dd6e9db74c4e9fadb373ec946c413a/polyml-5.4.1_x86_64-linux/
Collections                HOL-Nominal      Jinja               Pure
Group-Ring-Module          HOL-Probability  LatticeProperties   Refine_Monadic
HOL                        HOL-Word         List-Infinite       Simpl
HOL-Multivariate_Analysis  HOLCF            Nat-Interval-Logic  log
ps: "isabelle build" takes about 5 sec on my system to figure out dependencies (I think). Should this be faster?

Isabelle build produces only the required images, i.e. "inner" nodes of the hierarchy. A more ambitious version would not even do that, forking processes directly without going through image save/load first.

In the mira configuration http://isabelle.in.tum.de/repos/isabelle/file/7b6beb7e99c1/Admin/mira.py I don't see redundant -b options, in agreement with the above observation of images found later in the file-system.

I guess this time is spent hashing the dependencies to figure out whether they have changed -- at least, it is not time based.
I know that Git uses stat info as a first step and only if this changes proceeds to read the file.

Ever since I started taking care of theory loading and session management (approx. 1997), I have been aware of the various ways in which file identities are managed, including their advantages and disadvantages. The "stat" model is tied to a physical file-system, and was gradually phased out of Isabelle in the past couple of years. SHA1 computation on the whole file content is fast enough, and has the advantage that it is well-defined for non-local files, say via HTTP. Or files that are not files at all, just a vector of characters sent from the front-end to the back-end.
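To illustrate the point about content-based identity, here is a minimal Python sketch (not Isabelle's actual code; the function names are made up for this example). It shows that a SHA1 digest depends only on the bytes, so a file on disk and the same text held in memory get the same identity, which a "stat"-based scheme cannot offer.

```python
import hashlib


def sha1_digest(content: bytes) -> str:
    """Return the SHA1 hex digest of a byte string."""
    return hashlib.sha1(content).hexdigest()


def file_digest(path: str) -> str:
    """Digest of a physical file: read the whole content, then hash it."""
    with open(path, "rb") as f:
        return sha1_digest(f.read())


def text_digest(text: str) -> str:
    """Digest of text that never touched a file-system, e.g. an editor
    buffer sent from a front-end. UTF-8 encoding is an assumption here."""
    return sha1_digest(text.encode("utf-8"))
```

A dependency then counts as unchanged exactly when its digest equals the stored one, regardless of timestamps or of where the content came from.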

The slowness of the preparatory stage of Isabelle build has a different reason: reworking the old 'use' command to fit into the IDE model (without a file-system present) I've realized that the prover can now tell about auxiliary file-dependencies, via the new 'ML_file' command. This requires a full outer-parse of files that might contain such commands: it also explains the renaming from unspecific 'use' to specific 'ML_file', because the latter does not occur by accident in source files so often.
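As a rough picture of what discovering auxiliary files from source text involves, here is a deliberately simplified Python sketch. It is not Isabelle's outer-syntax parser: the regular expressions and the helper name `aux_files` are assumptions for illustration, it handles only non-nested `(* ... *)` comments, and it expects the path as a plain double-quoted string.

```python
import re
from typing import List

# Simplified assumption: the command name followed by a double-quoted path.
ML_FILE = re.compile(r'\bML_file\s+"([^"]+)"')

# Simplified assumption: comments are not nested (real Isabelle comments can be).
COMMENT = re.compile(r"\(\*.*?\*\)", re.DOTALL)


def aux_files(theory_text: str) -> List[str]:
    """Return the paths named by ML_file commands, in order of occurrence."""
    without_comments = COMMENT.sub(" ", theory_text)
    return ML_FILE.findall(without_comments)
```

Even this superficial scan has to read every theory file in full, which gives some idea why the preparatory stage takes noticeable time. It also shows why a specific keyword helps: a search for plain `use` would match far more accidental occurrences.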

This is a significant conceptual improvement, so the question of whether it is possible to do this superficial parsing a bit faster can be sorted out in the coming months. It might be just a matter of using some of Odersky's .par combinators.

It also indicates how huge the HOL image has now become, and how great Poly/ML 5.5.0 is at digesting that so quickly.

Generally, Scala/JVM technology has a tradition of slow startup, but once it is running it runs quite fast. When I start Proof General Emacs now and again by accident, I am surprised by its quick startup, but very slow running afterwards. The same for Python tools like "hg view" or "hgtk".

The EPFL guys have a funny trick in their Scala TTY loop to make it appear faster on startup than it is, but I have no plans to imitate this illusionistic approach.



        Makarius
_______________________________________________
isabelle-dev mailing list
isabelle-...@in.tum.de
https://mailmanbroy.informatik.tu-muenchen.de/mailman/listinfo/isabelle-dev
