Devs, I'm trying to understand how the core2core pipeline works. Sadly, we don't have a wiki page about this so the "only" source of information are the papers and the source code. Papers give pretty much detail about each transformation in separate but none of the papers gives a comprehensive and up-to-date overview of how the whole pipeline is structured. My questions are based on reading the user documentation and the following papers:
[1] - "The Glasgow Haskell Compiler" from The Architecture of Open Source Application, vol. 2 [2] - "Compilation by Transformation in Non-Strict Functional Languages", PhD by Santos [3] - "Secrets of the Glasgow Haskell Compiler inliner" [4] - "A transformation-based optimiser for Haskell" [5] - "Modular, Higher-Order Cardinality Analysis in Theory and Practice" [6] - "Let-floatig: moving bindings to give faster programs" [7] - "Playing by the Rules: Rewriting as a practical optimisation technique in GHC" I know there are several papers missing from this list, eg. "Constructed Product Result Analysis for Haskell" or "Call-pattern specialisation for Haskell programs". The reason is that these optimisations are beyond the scope of what I'm doing at the moment (or so I believe). This mail basically asks just one question: what is the order of optimizations pefromed on Core? Since this question is pretty big and general I've separated it into smaller questions that arose from reading the above papers, documentation, and experimenting with GHC. Now the detailed questions: 1. What is the difference between a "simplifier iteration" and "simplifier phase"? Section 7.20.6.5 of the user guide mentions phases but I believe that iterations are not explained anywhere. My best guess, expressed in pseudo-code, is this (sorry about the imperative style): foreach (i in iterations) { // some optimisations here? foreach (p in phases) { //...optimisations here } // some optimisations here? } 1a. What is the default maximum iterations count? User documentation does not specify that. 2. How can I observe the effects of `-ddump-simpl-phases`. I tried compiling several different programs and this flag seems to have no effect (ie. nothing gets printed). 3. Cardinality anlaysis and inlining: cardinality analysis can determine that a let binding is used exactly once. Can the inliner re-use this information from the cardinality analysis or does it recompute it per [3], section 3.1? 4. I've compiled a sample program using `-dverbose-core2core` and got the following phases: - Desugar (after optimization) - Simplifier (Phase = InitialPhase [Gentle]) - Specialise - Levels added - Float out - Float inwards - Simplifier (Phase = 2 [main]) - Simplifier (Phase = 1 [main]) - Simplifier (Phase = 0 [main]) - Demand analysis - Worker Wrapper binds - Simplifier (Phase = 0 [post-worker-wrapper]) - Levels added - Float out - Common sub-expression - Float inwards - Simplifier (Phase = 0 [final]) - Tidy Core - CorePrep This raises lots of questions: 4a. The first phase is "Desugar (after optimization)". What optimizations are performed during desugaring? 4b. I'm not sure whether I'm looking at a single iteration of core2core transformation or at multiple ones. Some passes are performed several times (Float out, Float inwards), which suggests that there might be many iterations here. On the other hand simplifier phases are decreasing towards 0, which looks as if it was one core2core iteration. My assumption here is that every time a new core2core iteration starts the simplifier phases are counted anew from 2 towards 0. Is that correct? 4c. Why are there several 0 phases of the Simplifier? I find it confusing. 4d. I understand that some passes can be enabled or disabled using command-line options. Can the decission to run some passes be made dynamically by the compiler (eg. to run extra simplifier passes)? 4e. Are there more phases that could appear here, ie. they were ommited with -O? 4f. "Levels added" pass before the "Float out" pass: my guess is that this is preparation for the full laziness transform. So, is full laziness performed as part of "Float out" pass? A general note is that I am confused by many Simplifier passes being interleaved with other passes. I expected that simplifier phases will grouped into a single pass, as speculated in question 1. 5. What optimizations *exactly* are performed by the Simplifier? I assume that most of what's described in chapter 3 of [2]: beta reduction, let elimination, case elimination, case floating, constant folding and eta expansion. I'm not sure about floating let outwards and inwards - [1], pg. 7, says these are in a pass separate from the simplifier. `-dverbose-core2core` seems to confirm that since it reveals separate "Float out" and "Float inwards" passes. 6. [4], pg. 31, mentions the Deforestation optimisation. Is everything described in that "Deforestation" section subsumed by cardinality analysis ([5], end of section 2.1 and section 7.1)? If not then when is deforestation performed? 7. [5], section 6.1 says: "We run the analysis twice, once in the middle of the optimisation pipeline and once near the end". When exactly in the middle of the pipeline? Between which passes? This does not show up with `-dverbose-core2core` (or at least it is not explicitly named). 8. How does the rules rewriting fit into the picture? Section 7.20.6.5 of the User Guide and section 4.1 of [7] explain the interaction between rules and inlining and my guess is that both are performed by the Simplifier. Again the "simplifier phases/iterations" distinction puzzles me as to what exactly is happening when. Within a single phase is the inlining happening before rewriting or vice versa? I know that all of the above questions can be answered by looking at the source code for sufficiently long. This is actually what I'm planning to do next but if anyone could help me by answering some of these questions this would certainly save me some time. My plan is to gather up the answers on a wiki page. Janek _______________________________________________ ghc-devs mailing list ghc-devs@haskell.org http://www.haskell.org/mailman/listinfo/ghc-devs