Hi all, This patchset implements the CPS compiler I've been working on over the last couple months. At the end of it, you can
guild compile -t rtl foo.scm and, if there are no bugs or unimplemented bits, at that point you have a loadable .go with RTL code. (Try readelf -a on the .go!) >From the console the way to do it is: (use-modules (system vm objcode)) (load-thunk-from-memory (compile foo #:to 'rtl)) That gives you the thunk that, when called, will execute FOO. So with that intro, some more on CPS and the RTL VM. As you know we have a new register-based VM, where all values have names (slots). It's appropriate that the language that compiles to this VM also give names to all intermediate values. CPS has this property. In addition, CPS gives names to all control points, effectively giving a label to each expression. This CPS language was inspired by Andrew Kennedy's great 2007 paper, "Compiling with Continuations, Continued". In particular, continuations are /local/ to a function, so they really are basic block labels, and all values are bound to variables via continuation calls -- even constants. As a little example: (+ 1 2) Here + is a primcall, so we would get: ($letk ((kone ($kargs (oneb) ($letk ((ktwo ($kargs (two) ($continue ktail ($primcall + one two))))) ($continue ktwo ($const 2)))))) ($continue kone ($const 1))) Here all CPS language constructs are prefixed with "$". Everything else is a variable, except the + in the primcall. As you can see it is incredibly verbose. At the same time it's very simple, as there are only two kinds of terms: terms that bind continuations, and terms that call continuations. $letk binds a set of mutually recursive continuations, each one an instance of $cont. A $cont declares the name and source of a continuation, and then contains as a subterm the particular continuation instance: $kif for test continuations, $kargs for continuations that bind values, etc. $continue nodes call continuations. The expression contained in the $continue node determines the value or values that are passed to the target continuation: $const to pass a constant value, $values to pass multiple named values, etc. Additionally there is $letrec, a term that binds mutually recursive functions. The contification pass will turn $letrec into $letk if it can do so. Otherwise, the closure conversion pass will desugar $letrec into an equivalent sequence of make-closure primcalls and subsequent initializations of the captured variables of the closures. You can think of $letrec as pertaining to "high CPS", whereas later passes will only see "low CPS", which does not have $letrec. There are a bunch of Guile-specific quirks in this language, mostly related to function prologues for the different kinds of arities, and for things like multiple-value truncation and prompts. Check out (language cps) for all the deal. So, after that patch is the Tree-IL->CPS compiler. It simplifies a number of Tree-IL concepts, fixing argument order, turning toplevel references to primcalls, transforming prompts, assignment conversion, etc. For this reason it's a bit hairy, but it seems to work fine. This compiler runs *after* optimization passes on Tree-IL, so you still have peval that runs over tree-il. After that follow a bunch of passes to build up an RTL compiler. The idea is to compile by incremental source-to-source passes, and at the end you just assign slots to all variables and emit code directly. If you ever worked with the old (language tree-il compile-glil) module, you know it's very hairy, mostly because it's mixing code emission with semantic transformations. Nowhere is this more evident than the so-called "labels allocation" strategy, in which we try to allocate procedures as basic blocks, if they all return to the same place. It's a funky pass. It turns out this concept is well-studied and has a name, "contification", and is cleanly expressed as a source-to-source pass over CPS. So compile-rtl.scm is dumb: it just loops over all expressions, emitting code for each of them, and emitting jumps and shuffling registers as needed (based on a prior analysis pass). In the end you can ,disassemble the code, thanks to the earlier RTL work. I don't want people to benchmark this stuff yet -- it's buggy and not optimized. Don't use this work for anything serious. But if you're in to hacking on it, that's cool. There are a number of optimization passes that are needed (see compile-rtl.scm for a list), but currently it's looking like the RTL compiler does significantly better on loops, but struggles to keep up with the old VM for calls. This is unsurprising, as calls really do work like stacks, and a stack VM has many advantages there. But we think that with better slot allocation we can probably be faster for calls as well. Thank you, thank you, thank you to Mark Weaver for helping out with this work! Without him there would be many more bugs. I've folded all of his patches into this patchset, so please consider him the joint author of all of this. In any case the bugs have forced him to page much of this code into his head so we are better equipped as a project because of that ;-) Thank you also to Noah Lavine, who did an earlier pass at CPS conversion. I considered starting from his work but it became clear that many aspects of CPS would be nicer with changes in Tree-IL -- so I took advantage of my maintenance of the old compiler to make various changes there and in other parts of the runtime to get a cleaner CPS compiler. In the end we took advantage of his vanguard-hacking, though as ideas rather than as code. Thank you Noah! There are not as many tests and documentation as one would like. Ultimately it all just came together in the last couple of weeks, so I think the next step is to write more tests and try the compiler on new pieces of code. There are a couple bugs that we know about at this point, and surely many more that we don't know about. I ask for some lenience on this front while we figure out what the compiler should look like :) OK. WDYT, Ludo? Comments? OK to merge? :) Cheers, Andy