This is very exciting! I have a very small question, based on something I think you said earlier - since the container will be ELF, will we call our files .so now?
Noah On Fri, May 17, 2013 at 5:49 PM, Andy Wingo <wi...@pobox.com> wrote: > Friends! Schemers! Gentle Guilefolk! The time has come to begin in > earnest on the road to Guile 2.2! > > I know all of you are wondering, "did that paragraph really need four > exclamation marks?" Well the answer is yes, yes it did, and the rest of > this mail lays out the reasons. > > So, what does 2.2 mean, why is it appropriate now? 2.2 means many > things on many levels, but on its most basic level, it will result in a > new series of Guile deployments. This series will be installed in > parallel with Guile 2.0 [0]. It shouldn't disrupt people that develop > or distribute C extensions to Guile or programs written in Guile. True, > the binary named "guile" on your system might be 2.0 or 2.2, depending > on what you have installed, though one can get around that with passing > --program-suffix to ./configure. In total, the disruptive impact of a > new series is low. > > [0] > http://www.gnu.org/software/guile/manual/html_node/Parallel-Installations.html > > On the other hand, creating a new development series does have a cost in > splitting our development effort away from the stable branch, and so we > have to justify that. This is the second part of the question, "why > now?" The answer is that we have important improvements to make to the > Guile implementation that can't be made in a strictly binary-compatible > way. We think these things are so exciting as to be worth the > inconvenience of a new stable series. We might still make a few more > 2.0 maintenance releases, but when I describe what we have on tap for > 2.2 you will see why we're excited to move to the next phase. > > RTL VM > ====== > > Guile 2.0 added a compiler and a virtual machine to what was previously > a pure-interpreter implementation of Scheme. I know we all are pretty > stoked about that, but some time has passed and it's time to look at > what we have and see how we can do better. > > The Guile 2.0 VM is a stack machine. That means that its instructions > usually take their values from the stack, and produce values (if > appropriate) by pushing values onto the stack. > > The problem with stack machines is that they penalize named values. If > I realize that a computation is happening twice and I factor it out to a > variable, that means in practice that I allocate a stack frame slot to > the value. So far so good. However, to use the value, I have to emit > an instruction to fetch the value for use by some other instruction; and > to store it, I likewise have to have another instruction to do that. > > For example, in a stack machine, compiling "z = x + y" might mean: > > (local-ref 0) ; x > (local-ref 1) ; y > (add) > (local-set! 2) ; z > > The cost of interpreting a bytecode is largely dispatch cost, which is > linear in the number of instructions executed. So factoring out a > computation to a local variable can sometimes actually cause code to run > slower, depending on the relative cost of the computation versus > accessing its operands. Pushing and popping values can also be > expensive as we are updating a stack pointer all the time, possibly > checking for overflow as well. > > The solution to this problem is to use a "register machine". I use > scare quotes because in fact this is a virtual machine, so unlike a CPU, > the number of "registers" is unlimited, and in fact they are just stack > slots accessed by index. > > So in a register machine, "z = x + y" might be: > > (add 2 0 1) ; register 2 = register 0 + register 1; > > Of course the code for a register VM has to be bigger: instead of _byte_ > code that doesn't need to encode its operands because they are expected > on the stack, register VM code has to encode the "register names", so it > ends up being better to implement it as _word_ code. But this cost is > much less significant than the dispatch cost. > > Anyway, that's the back-story. For Guile 2.2, I have implemented a new > register-based virtual machine. For no good reason I called it the > "RTL" VM, for "register transfer language". RTL is a name for a > particular kind of intermediate representation in compilers, but perhaps > it isn't the most appropriate denomination for what we're doing. In any > case the name seems to have stuck. Perhaps we need a marketing guru > here to find some other compelling name :) > > Given that we're changing the bytecode... > ========================================= > > So RTL is a new bytecode, a new VM, and will ultimately need a new > compiler. I'll talk a bit more about the compiler in a minute, but I > want to mention another aspect of RTL. > > Besides being faster than the Guile 2.0 VM, RTL opens up other > interesting possibilities. One of the most exciting is the ability to > statically allocate all kinds of data. In Guile 2.0, when a .go file > loads from disk, it allocates all of the constants that it needs on the > heap. But that's sub-optimal in many ways. For example, a pair of > immediates like '(1 . 2) doesn't actually need to be allocated at all -- > it can just be part of the loaded image. Likewise a pair with > non-immediates like ("foo" . "bar") can also be allocated in the image, > but it needs fixing up at runtime so that its fields point to the "foo" > and "bar" values. In this particular case the "foo" and "bar" would be > values also in the image. > > What I mean to say is that we can, at compile time, allocate and lay out > the memory for the constants that a program will need, compute the > minimal set of initializations that those constants will need, and make > the right links from the compiled code to the constants. This is > exactly like what the C compiler, linker, and dynamic loader do -- so we > take pointers from them and do exactly the same thing. The RTL branch > writes out compiled data to ELF files. When a .go file is loaded, it > runs a thunk to do the run-time initializations, then returns the "main" > thunk for the .go (which the caller can run or not). > > Note that we use ELF for all platforms. Since we don't rely on the > platform's linker or dynamic loader, there's no sense in using the > format of each individual platform. That said, the files that Guile > produces are readable with readelf, and it should be possible to operate > on them with standard ELF tools. > > Guile's linker and loader are careful to separate read-only data like > bytecode and writable data like those pairs that need runtime > initialization. In this way we can maximize sharing between processes. > Also, Guile puts a bunch of data that's not needed in normal execution > at the end of the file, in the read-only section -- like docstrings, for > example. If a program never asks for procedure-documentation of a > procedure, those docstrings will never be paged in by the kernel; but if > a program does do a lot of documentation grovelling, they are readily > accessible. > > Separating the different metadata into different sections also makes it > possible to strip metadata. For example when deploying to systems with > not much disk space, you might strip docstrings or line number > information. You can use standard ELF tools to do this, or do it using > Guile's ELF tools. > > Finally, using a well-structured format like ELF brings in the > possibility of linking together a number of modules in one file, and > possibly even an entire application. We don't have this working yet, > and it's not in the immediate plans, but it is a great possibility to > keep in mind. > > What about the compiler? > ======================== > > So we have a new VM, assembler, disassembler, and all the things! It's > pretty close to finished: it's just missing the part that lets a > debugger know about line numbers and local variables. But how do we > actually produce RTL assembly from Scheme? > > Well, there we don't have a finished story. Noah Lavine made a branch > to compile tree-il, our high-level intermediate language, to a > "continuation-passing-style" (CPS) form. This is a useful > transformation for many reasons, but among them it is convenient for > giving a "name" to all values -- a property that a register VM prizes. > > His CPS compiler isn't done yet, though when I last looked at it, it was > looking great. So the path goes Scheme -> Tree-IL (something we already > have), then -> CPS (something Noah did), then -> RTL Assembly (again, > Noah), then -> bytecode (the piece I did). It doesn't yet work with all > of Guile but it's getting there. > > How to get there? > ================= > > However, I think we've done all we can in branches. I think we should > bless this RTL experiment as the way to do Guile 2.2. (Thoughts or > objections welcome.) To that end, I think we need to start merging > wip-rtl into master. > > What I propose is that, if we agree, I start merging in trivial stuff. > When I get to something interesting that's not just a refactoring of > existing things, I'll submit that patch to the list. We have a few days > to look at it, we go through some review, and then we figure out how to > move forward -- usually merging some version of that patch. Then we > repeat. > > Once the RTL branch is all merged in, we can start doing the same with > Noah's wip-rtl-cps branch. > > Then eventually some glorious day comes and we start using the CPS/RTL > toolchain, everything is working great and fast, and we start deleting > the old code. > > What do you all think? > > I love it when a plan comes together! > > Cheers, > > Andy > -- > http://wingolog.org/ > >