Devs,
I'm trying to understand how the core2core pipeline works. Sadly, we don't have
a wiki page about
this so the "only" source of information are the papers and the source code.
Papers give pretty
much detail about each transformation in separate but none of the papers gives
a comprehensive
and up-to-date overview of how the whole pipeline is structured. My questions
are based on
reading the user documentation and the following papers:
[1] - "The Glasgow Haskell Compiler" from The Architecture of Open Source
Application, vol. 2
[2] - "Compilation by Transformation in Non-Strict Functional Languages", PhD
by Santos
[3] - "Secrets of the Glasgow Haskell Compiler inliner"
[4] - "A transformation-based optimiser for Haskell"
[5] - "Modular, Higher-Order Cardinality Analysis in Theory and Practice"
[6] - "Let-floatig: moving bindings to give faster programs"
[7] - "Playing by the Rules: Rewriting as a practical optimisation technique in
GHC"
I know there are several papers missing from this list, eg. "Constructed
Product Result Analysis
for Haskell" or "Call-pattern specialisation for Haskell programs". The reason
is that these
optimisations are beyond the scope of what I'm doing at the moment (or so I
believe).
This mail basically asks just one question: what is the order of optimizations
pefromed on Core?
Since this question is pretty big and general I've separated it into smaller
questions that arose
from reading the above papers, documentation, and experimenting with GHC.
Now the detailed questions:
1. What is the difference between a "simplifier iteration" and "simplifier
phase"? Section
7.20.6.5 of the user guide mentions phases but I believe that iterations are
not explained
anywhere. My best guess, expressed in pseudo-code, is this (sorry about the
imperative style):
foreach (i in iterations) {
// some optimisations here?
foreach (p in phases) {
//...optimisations here
}
// some optimisations here?
}
1a. What is the default maximum iterations count? User documentation does not
specify that.
2. How can I observe the effects of `-ddump-simpl-phases`. I tried compiling
several different
programs and this flag seems to have no effect (ie. nothing gets printed).
3. Cardinality anlaysis and inlining: cardinality analysis can determine that a
let binding is
used exactly once. Can the inliner re-use this information from the cardinality
analysis or does
it recompute it per [3], section 3.1?
4. I've compiled a sample program using `-dverbose-core2core` and got the
following phases:
- Desugar (after optimization)
- Simplifier (Phase = InitialPhase [Gentle])
- Specialise
- Levels added
- Float out
- Float inwards
- Simplifier (Phase = 2 [main])
- Simplifier (Phase = 1 [main])
- Simplifier (Phase = 0 [main])
- Demand analysis
- Worker Wrapper binds
- Simplifier (Phase = 0 [post-worker-wrapper])
- Levels added
- Float out
- Common sub-expression
- Float inwards
- Simplifier (Phase = 0 [final])
- Tidy Core
- CorePrep
This raises lots of questions:
4a. The first phase is "Desugar (after optimization)". What optimizations are
performed during
desugaring?
4b. I'm not sure whether I'm looking at a single iteration of core2core
transformation or at
multiple ones. Some passes are performed several times (Float out, Float
inwards), which suggests
that there might be many iterations here. On the other hand simplifier phases
are decreasing
towards 0, which looks as if it was one core2core iteration. My assumption here
is that every
time a new core2core iteration starts the simplifier phases are counted anew
from 2 towards 0. Is
that correct?
4c. Why are there several 0 phases of the Simplifier? I find it confusing.
4d. I understand that some passes can be enabled or disabled using command-line
options. Can the
decission to run some passes be made dynamically by the compiler (eg. to run
extra simplifier
passes)?
4e. Are there more phases that could appear here, ie. they were ommited with -O?
4f. "Levels added" pass before the "Float out" pass: my guess is that this is
preparation for the
full laziness transform. So, is full laziness performed as part of "Float out"
pass?
A general note is that I am confused by many Simplifier passes being
interleaved with other
passes. I expected that simplifier phases will grouped into a single pass, as
speculated in
question 1.
5. What optimizations *exactly* are performed by the Simplifier? I assume that
most of what's
described in chapter 3 of [2]: beta reduction, let elimination, case
elimination, case floating,
constant folding and eta expansion. I'm not sure about floating let outwards
and inwards - [1],
pg. 7, says these are in a pass separate from the simplifier.
`-dverbose-core2core` seems to
confirm that since it reveals separate "Float out" and "Float inwards" passes.
6. [4], pg. 31, mentions the Deforestation optimisation. Is everything
described in
that "Deforestation" section subsumed by cardinality analysis ([5], end of
section 2.1 and
section 7.1)? If not then when is deforestation performed?
7. [5], section 6.1 says: "We run the analysis twice, once in the middle of the
optimisation
pipeline and once near the end". When exactly in the middle of the pipeline?
Between which
passes? This does not show up with `-dverbose-core2core` (or at least it is not
explicitly
named).
8. How does the rules rewriting fit into the picture? Section 7.20.6.5 of the
User Guide and
section 4.1 of [7] explain the interaction between rules and inlining and my
guess is that both
are performed by the Simplifier. Again the "simplifier phases/iterations"
distinction puzzles me
as to what exactly is happening when. Within a single phase is the inlining
happening before
rewriting or vice versa?
I know that all of the above questions can be answered by looking at the source
code for
sufficiently long. This is actually what I'm planning to do next but if anyone
could help me by
answering some of these questions this would certainly save me some time. My
plan is to gather up
the answers on a wiki page.
Janek
_______________________________________________
ghc-devs mailing list
[email protected]
http://www.haskell.org/mailman/listinfo/ghc-devs