[greg.alt@gmail.com: Re: bootstrapping from ECHO (SPOILERS)]

Kragen Javier Sitaker Fri, 29 Apr 2011 11:47:59 -0700

Forwarded since Greg's copy presumably bounced.

----- Forwarded message from Greg Alt <greg....@gmail.com> -----

Subject: Re: bootstrapping from ECHO (SPOILERS)
From: Greg Alt <greg....@gmail.com>
To: Kragen Javier Sitaker <kra...@canonical.org>
Cc: Dave Long <dave.l...@bluewin.ch>, kragen-disc...@lists.canonical.org

On Thu, Apr 28, 2011 at 3:54 PM, Kragen Javier Sitaker
<kra...@canonical.org> wrote:
>> Btw, there are some bugs in the highest level version of my compiler
>> that make it choke on compiling complex code in scheme4 (such as a
>> scheme4 compiler written in scheme4, which was my next goal).  Mostly
>> the issue is that I hadn't implemented garbage collection yet and the
>> way I compile nested if statements ended up consing way too much.
>
> How much is way too much?  Is it exponential or something?  That might
> cause difficulty with execution time even once you have garbage
> collection implemented.

Part of it is that I'm using only a small portion of 64k for my heap.
And part of it is that my first stab at compiling if's actually
compiles the true and false clauses twice each - once to get the size
and once to emit the machine code.  And both cases end up consing the
same amount.

So if you do something like:

(if (eq x 1) (do1)
  (if (eq x 2) (do2)
    (if (eq x 3) (do3)
      (if (eq x 4) (do4)
         (if (eq x 5) (do5) (do6))))))

(do1) gets compiled twice, (do2) gets compiled 4 times, ... and (do6)
gets compiled 64 times.  Since compiling anything ends up consing up a
list of bytes to emit, if (do6) is an expression that ends up taking
256 bytes, it ends up consing 16k cons cells ignoring all the other
overhead, and cons cells are 4 bytes, so that's already 64k which is
way over my limit.  :)

Changing all that around to cons less would be easy in scheme4, but is
very tedious in scheme3 - and I can't have a working scheme4 compiler
written in scheme4 until I get my scheme4 compiler that's written in
scheme3 working.

The joys of bootstrapping... :)

> In http://canonical.org/~kragen/sw/urscheme/ I still haven't implemented
> the garbage collector.  But I think it only allocates a meg or two of
> memory before it finishes compiling itself, which isn't a big deal on
> modern machines.  It might be a problem in MS-DOS, though.

Especially 64k .com files :)

>> Implementing garbage collection in scheme3, though, turned out to be
>> quite a slog.  It works, but still seems to choke at a later point,
>> and I'm not sure why - and as you can imagine debugging this stuff
>> without a proper debugger and error messages is slow going.
>
> I found debugging Ur-Scheme was aided substantially by having GDB handy.
> I didn't resort to it often, though, because the emitted code is full of
> run-time type-checks, which were usually what failed.  And being able to
> hit F5 in Emacs and immediately see the diff of the emitted assembly
> code, followed by any test failures, was a big help.

My goal was to start from as scratch as possible and cheat as little
as possible.  I ended up using debug.com to avoid wasting lots of time
just to stick to my principles, but anything fancier just didn't seem
right.

> StoneKnifeForth <https://github.com/kragen/stoneknifeforth> may be of
> some interest to you.  I need to fix its ELF header to be able to run on
> current Linuxes.  Its set of 18 primitives (or 20, or 21, depending on
> how you count) are sufficient to implement it, but need to be extended a
> bit to support other programs.  I suspect that you will find that
> compiling Scheme to a set of stack-based primitives is easier than
> compiling it to a set of register-based priitives.

I ended up doing things mostly stack-based, but mostly sticking to the
principle you discovered, "Start with the simplest thing that could
possibly work", so it's all ugly hacks just meant to limp across the
finish line to the next higher-level language that would be easy to
work with.

>> It's definitely been a fascinating project.  My goal was to learn some
>> higher level abstract concepts about how to think about bootstrapping
>> in general - even apart from computers, and I think I've learned a
>> lot.
>
> Can you articulate any of the things you've learned about bootstrapping
> apart from computers?

Sure.  When I start fun projects, I like to double-up the fun.  A
while back I wrote a real-time raytracer in SBCL common lisp.  I was
interested in real-time raytracing and also was intrigued by the idea
of writing super-optimized common lisp, so I combined both objectives
in one project.  Later I did a project using the OpenCV computer
vision library in python to try to take pictures of a go board with go
stones on it in mid-game and generate a .sgf file with the stone
positions - that let me accomplish three objectives at once.  So for
this project, I decided to quadruple the fun - learning 8086 assembly,
learn the ins and outs of DOS and .com files, making a lisp compiler,
and learning how to think in a bootstrappy way.

So as for general bootstrapping concepts, outside of computers, I've
been thinking about how to grow organizations of people in a way that
is self-sustaining, also starting from scratch.  A lot of similar
things apply.  As an engineer, I really want to fully spec out my end
goal and then methodically put the pieces together until I'm done, but
that's not at all what you want to do when you bootstrap anything.
With a compiler, you'd spend decades hand-typing pages and pages of
hex code, manually calculating jumps and other nonsense.

With an organization, you'd spend way too much time up front planning
the end goal, and then you'd try to somehow recruit people one at a
time to fill all the slots you created - with the idea that once all
the slots were filled, you'd be able to start doing something.

Instead, I hit on an idea of broad-strokes planning.  You look at your
ideal end-goal and where you are at.  And since it's clearly not
possible to get from here to there in one effort, you break it down
logarithmically.  So I tried to imagine working backwards to the
simplest possible lisp language that could be used to implement a full
scheme.  Or similarly, an organization of 200 people that could create
an organization of 2000.  That's fine and all, and you can start to
spec things out a bit, but it's still too big of a jump.  So the next
step was to REALLY scale things back to the bare minimum you could
possibly imagine that could take you to the middle step.  Then from
there, you work both backwards and forwards to find your first step,
finding a balance between the simplest thing you can possibly do vs.
the step that makes your life easier in the immediate future.

Then there's wrapping your head around the concept of first you
implement stage N using the N-1 stage language (or organization).
Once that works, you stabilize where you're at by implementing stage N
using your stage N implementation built from stage N-1.  Then you
launch forward again to N+1, and so on.

And quite often, you find yourself overreaching, trying to do too
much.  In this way, your stage N implemented with stage N-1 never
materializes and you need to fall back to stage N-1 and rethink stage
N to be less ambitious.

Before I started, this bootstrappy way of thinking tied my brain in
knots with even relatively simple problems.  Now it's starting to make
sense.  And I like the way you can get boxed in a corner and just
imagine some magic to unblock you, and if it's simple enough magic and
it works, you feel like you're getting away with something that
shouldn't be possible, like you're using your first wish to ask the
genie for 5 more wishes.

Greg

----- End forwarded message -----
-- 
To unsubscribe: http://lists.canonical.org/mailman/listinfo/kragen-discuss

[greg.alt@gmail.com: Re: bootstrapping from ECHO (SPOILERS)]

Reply via email to