On Sat, 1 Sep 2012, Lars Noschinski wrote:
On 01.09.2012 01:15, Gerwin Klein wrote:
I guess the problem is that build -b produces heap images for all sessions.
These are many, and each image is hundreds of MB. The previous setup had
images for selected base sessions only and no images for the rest (almost
everything).
The "AFP" session in mira does only produce the base images, AFAICS:
$ ls AFP_c5dd6e9db74c4e9fadb373ec946c413a/polyml-5.4.1_x86_64-linux/
Collections HOL-Nominal Jinja Pure
Group-Ring-Module HOL-Probability LatticeProperties Refine_Monadic
HOL HOL-Word List-Infinite Simpl
HOL-Multivariate_Analysis HOLCF Nat-Interval-Logic log
ps: "isabelle build" takes about 5 sec on my system to figure out
dependencies (I think). Should this be faster?
Isabelle build produces only the required images, i.e. "inner" nodes of
the hierarchy. A more ambitious version would not even do that, forking
processes directly without going through image save/load first.
In the mira configuration
http://isabelle.in.tum.de/repos/isabelle/file/7b6beb7e99c1/Admin/mira.py I
don't see redundant -b options, in agreement with the above observation of
images found later in the file-system.
I guess this time is spent hashing the dependencies to figure out,
whether they have changed -- at least, it is not time based.
I know that Git uses stat info as a first step and only if this changes
proceeds reading the file.
Ever since I am taking care of theory loading and session management
(approx. 1997) I am aware of the various ways how file identifies are
managed, including their advantages and disadvantages. The "stat" model
is tied to a physical file-system, and was gradually phased from Isabelle
in the past couple of years. SHA1 computation on the whole file content
is fast enough, and has the advantage that it is well-defined for
non-local files, say via HTTP. Or files that are not files at all, just a
vector of characters sent from the front-end to the back-end.
The slowness of the preparatory stage of Isabelle build has a different
reason: reworking the old 'use' command to fit into the IDE model (without
a file-system present) I've realized that the prover can now tell about
auxiliary file-dependencies, via the new 'ML_file' command. This requires
a full outer-parse of files that might contain such commands: it also
explains the renaming from unspecific 'use' to specific 'ML_file', because
the latter does not occur by accident in source files so often.
This is a significant conceptual improvement, so the question if it is
possible to do this superficial parsing a bit faster can be sorted out in
the coming months. It might be just a matter to use some of Oderski's
.par combinators.
It also indicates how huge the HOL image has now become, And how great
Poly/ML 5.5.0 is at digesting that so quickly.
Generally, Scala/JVM technology has a tradition of slow startup, but once
it is running it runs quite fast. When I start Proof General Emacs now
and again by accident, I am surprised by its quick startup, but very slow
running afterwards. The same for Python tools like "hg view" or "hgtk".
The EPLF guys have a funny trick in their Scala TTY loop to make it appear
faster on startup as it is, but I have no plans to imitate this
illusionistic approach.
Makarius
_______________________________________________
isabelle-dev mailing list
isabelle-...@in.tum.de
https://mailmanbroy.informatik.tu-muenchen.de/mailman/listinfo/isabelle-dev