metacircular Bicicleta-language interpreter

Kragen Javier Sitaker Sat, 24 Feb 2007 07:35:24 -0800

Alan Kay frequently expresses enthusiasm over the metacircular Lisp
interpreter in the Lisp 1.5 Programmer's Manual.  For example, in
http://acmqueue.com/modules.php?name=Content&pa=showpage&pid=273&page=4
he writes:


    Yes, that was the big revelation to me when I was in graduate
    school --- when I finally understood that the half page of code on
    the bottom of page 13 of the Lisp 1.5 manual was Lisp in
    itself. These were "Maxwell's Equations of Software!" This is the
    whole world of programming in a few lines that I can put my hand
    over.

    I realized that anytime I want to know what I'm doing, I can just
    write down the kernel of this thing in a half page and it’s not
    going to lose any power. In fact, it’s going to gain power by
    being able to reenter itself much more readily than most systems
    done the other way can possibly do.

So I thought I'd look up the interpreter and see what it looked like.
Here's my transcription of the code on the bottom of page 13:

evalquote[fn;x] = apply[fn;x;NIL]
apply[fn;x;a] =
     [atom[fn] -> [eq[fn;CAR] -> caar[x];
                   eq[fn;CDR] -> cdar[x];
                   eq[fn;CONS] -> cons[car[x];cadr[x]];
                   eq[fn;ATOM] -> atom[car[x]];
                   eq[fn;EQ] -> eq[car[x];cadr[x]];
                   T -> apply[eval[fn;a];x;a]];
      eq[car[fn];LAMBDA] -> eval[caddr[fn];pairlis[cadr[fn];x;a]];
      eq[car[fn];LABEL] -> apply[caddr[fn];x;cons[cons[cadr[fn];
                                                  caddr[fn]];a]]]
eval[e;a] = [atom[e] -> cdr[assoc[e;a]];
      atom[car[e]] -> 
              [eq[car[e],QUOTE] -> cadr[e];
              eq[car[e];COND] -> evcon[cdr[e];a];
              T -> apply[car[e];evlis[cdr[e];a];a]];
      T -> apply[car[e];evlis[cdr[e];a];a]]
evcon[c;a] = [eval[caar[c];a] -> eval[cadar[c];a];
              T -> evcon[cdr[c];a]]
evlist[m;a] = [null[m] -> NIL;
               T -> cons[eval[car[m];a];evlis[cdr[m];a]]]

Note that this refers to the following functions that it does not
define: caar, cdar, cadr, caddr, pairlis, and assoc.  pairlis and
assoc are from p.12:
pairlis[x;y;a] = [null[x]->a;T->cons[cons[car[x];car[y]];
                 pairlis[cdr[x];cdr[y];a]]]
assoc[x;a] = [equal[caar[a];x]->car[a];T->assoc[x;cdr[a]]]

This definition of assoc depends on equal, but it's perfectly adequate
to use eq for this purpose.  But if you want, you can use the
definition from p.10:

equal[x;y]=[atom[x]->[atom[y]->eq[x;y]; T->F];
                equal[car[x];car[y]]->equal[cdr[x];cdr[y]];
                T->F]

Explicit definitions of the other accessors are as follows:

caar[x] = car[car[x]]
cdar[x] = cdr[car[x]]
cadr[x] = car[cdr[x]]
caddr[x] = car[cdr[cdr[x]]]

That gives us a full metacircular interpreter.  It's 28 lines of code
and 1222 characters, which definitely qualifies as "half a page" in my
book.  It glosses over the following details (inheriting them from the
host implementation):
- atom comparison (eq)
- memory management
- car, cdr, cons
- control flow, in the form of conditional expressions and otherwise
  eager evaluation
- parsing
- type checking and type testing (atom)
- recursive function call and return
  - it inherits the tail-recursion optimization as well

It also omits error handling, efficient name lookup, the ability to
define names at the top-level, and other intrinsics, like arithmetic
and I/O.

So I wrote a metacircular interpreter for the language of Bicicleta.
It's a bit longer at 47 lines.  I haven't tested it, so it might have
some stupid bug in it.

metacircular_bicicleta_interpreter = {top:
  # looking up names in environments; phi is the empty environment.
  phi = {e: 
    get = {g: key = "foo",
      '()' = globals.error(err="key not found", what=g.key) }
    add = {a: key = "bar", value = "foo",
      '()' = top.phi {n: key = a.key, value = a.value, next = e,
        get = {g: key = "foo", '()' = (g.key == n.key).if_true(
          then = n.value, else = n.next.get(key = g.key))}}}}
  # Five types of expression: names, method calls, override or
  # "derivation" expressions, object literals, and constants.  I'm
  # leaving out constants for now.
  name = {v: name = "x", eval = {e: env=top.phi, '()'=e.env.get(key=v.name)}}
  call = {c: object = top.name, method_name = "walnuts",
    eval = top.name.eval {e:
      object = c.object.eval(env = e.env)
      '()' = e.object.get(key=c.method_name).apply(self=e.object)}}
  literal = {l: self = "self", methods = top.no_defs,
    eval = top.name.eval {e:
      '()' = l.methods.bind(base=top.proto_object, env=e.env, self=l.self)}}
  derivation = {d: object = top.name, methods = top.no_defs, self = "self",
    eval = top.call.eval {e:
      '()' = d.methods.bind(base=e.object, env=e.env, self=d.self)}}
  # Derivation and object-literal expressions contain lists of method 
  # definitions:
  no_defs = {o: bind = {b: base = top.proto_object, '()' = b.base } }
  definition = top.no_defs {d: name="foobar", body=top.name, next=top.no_defs,
    bind = {b: base = top.proto_object, env = top.phi, self = "self",
      '()' = d.next.bind(
        base = b.base.derive(name=d.name, body=d.body, self=b.self, env=b.env)
        env = b.env
        self = b.self)}}
  proto_object = {o:
    get = {g: key = "quux",
      '()' = globals.error(err="method not found", what=g.key)}
    derive = {d: name="quux", self="self", body=top.name, env=top.phi,
        '()' = top.proto_object {m: name=d.name, self=d.self, body=d.body,
          env=d.env, next=o,
          get = {g: key = "quux", '()' = (g.key == m.name).if_true(
            then = m, else = m.next.get(key = g.key))}
          apply = top.apply { method = m }}}}
  apply = {a:
    self = top.proto_object
    method = top.proto_object.derive()
    '()' = a.method.body.eval(
      env = a.method.env.add(key = a.method.self, value = a.self))}
}

Most of the additional text consists of example values like 'key =
"quux"', which are there for concreteness.  If we omit them and
comments, and inline apply instead of putting it at the top level, we
get a shorter version:

terse_metacircular_interpreter = {top:
  phi = {e: 
    get = {g: '()' = globals.error(err="key not found", what=g.key) }
    add = {a: '()' = top.phi {n: key = a.key, value = a.value, next = e,
      get = {g: '()' = (g.key == n.key).if_true(
        then = n.value, else = n.next.get(key = g.key))}}}}
  name = {v: eval = {e: '()'=e.env.get(key=v.name)}}
  call = {c: eval = top.name.eval {e:
      object = c.object.eval(env = e.env)
      '()' = e.object.get(key=c.method_name).apply(self=e.object)}}
  literal = {l: eval = top.name.eval {e:
      '()' = l.methods.bind(base=top.proto_object, env=e.env, self=l.self)}}
  derivation = {d: eval = top.call.eval {e:
      '()' = d.methods.bind(base=e.object, env=e.env, self=d.self)}}
  no_defs = {o: bind = {b: '()' = b.base }}
  definition = top.no_defs {d: bind = {b: '()' = d.next.bind(
        base = b.base.derive(name=d.name, self=b.self, body=d.body, env=b.env)
        env = b.env
        self = b.self)}}
  proto_object = {o: get = {g:
      '()' = globals.error(err="method not found", what=g.key)}
    derive = {d: '()' = top.proto_object {m: name=d.name, self=d.self, 
      body=d.body, env=d.env, next=o,
      get = {g: '()' = (g.key == m.name).if_true(
        then=m, else=m.next.get(key = g.key))}
      apply = {a: '()'=m.body.eval(env=m.env.add(key=m.self, value=a.self))}}}}}

That's 26 lines, 1349 characters.  I'm not sure which version is
easier to understand.  It would be possible to shorten it further if
it used positional arguments; as it is, it uses named arguments
throughout.  (Also, 'get' is duplicated.)

You'll note that these are lexically scoped (evaluation of derivation
expressions uses the environment) rather than dynamically scoped
(apply uses the environment) as in the Lisp 1.5 case.

These metacircular interpreters of Bicicleta leave the following undefined:
- evaluation of constants;
- '==' on strings --- it has to be something with an 'if_true' method;
- globals.error, although this could simply be omitted as in the Lisp version;
- the '()' syntactic sugar (foo(bar) means foo{bar}.'()'); I assume
  the parser takes care of this.

In addition, it glosses over most of the rest of the full Lisp list:
- memory management
- control flow, this time in the form of lazy evaluation
- parsing
- recursive function call and return
  - it inherits the tail-recursion optimization as well

There's a bit of an efficiency cheat; this metacircular interpreter
will do redundant work in cases where the host implementation would
not, in the sense that calling the same method on the same object
repeatedly will recompute its value every time instead of simply
returning the previous result.  This means that pretty much any
interesting algorithm becomes exponential-time.  I am thinking that I
can write a version that does not have this problem.

What's interesting is the list of things the Lisp metacircular
interpreter glosses over that this interpreter doesn't:
- car, cdr, cons: these are replaced by, essentially, combinations of
  functions;
- conditional expressions: these are replaced by lazy evaluation and
  dynamic dispatch
- type checking and type testing (atom): these are replaced by dynamic
  dispatch

On the Representation of Expressions
------------------------------------

Expressions consist of four kinds of expression objects, each with an
'eval' method (name, call, literal, and derivation), and method
definition lists that go inside two of the kinds of expressions,
represented by the 'no_defs' and 'definition' objects.

'definition' objects form a linked list terminated by a 'no_defs'
object.  Definition lists have a single interesting method, 'bind',
which constructs objects; you pass it the self-name, the object to
derive from, and the environment in which the definitions are
evaluated.

In my earlier Bicicleta programs, and in Abadí and Cardelli's work,
the self-name is textually a part of each method definition --- a
method might read (in their notation) "x = sigma(self) self.y" or (in
mine) "self.x = self.y".  I concluded that it was better to put the
self-name at the beginning of the list of methods, and that is
reflected in the expression representation: 'literal' and 'derivation'
objects have a 'self' slot that contains the name used by all the
method definitions they contain.

There is no code in the interpreter for constructing these
expressions, but you can do it explicitly.  Here's my attempt to
hand-compile "{v: eval = {e: env=top.phi, '()'=e.env.get(key=v.name)}}", 
which is what the definition of "name" looks like if you leave out the
literal string.  I'm assuming that this is evaluated in an environment
where "m" contains the metacircular interpreter bits given here.

m.literal { self = "v", methods = m.definition { name = "eval",
  body=m.literal { self="e", methods = m.definition { name = "env",
    body = m.call { object = m.name { name = "top" }, method_name = "phi" },
    next = m.definition { name = "()", body = m.call { method_name = '()',
      object = m.derivation { self = "unused name", object = m.call {
          method_name = 'get', object = m.call { method_name = 'env',
            object = m.name { name = "e" } } }, 
        methods = m.definition { name = "key", 
          body = m.call { object = m.name { name = "v" }, method_name = "name" 
} }
      } } } } } } }

Grammar
-------

No.  Later.

Getting Laziness Back
---------------------

The previous versions of the metacircular interpreter will re-evaluate
the same method multiple times if its value is needed multiple times,
even if they are running on a graph-reduction implementation that does
not have this problem.  Here's a version that doesn't do that:

# external dependencies: string.'==', string.'=='.'()'.if_true, globals.error
memoizing_bicicleta_interpreter = {top:
  # looking up names in environments; phi is the empty environment.
  phi = {e: 
    get = {g: key = "foo",
      '()' = globals.error(err="key not found", what=g.key) }
    add = {a: key = "bar", value = "foo",
      '()' = top.phi {n: key = a.key, value = a.value, next = e,
        get = {g: key = "foo", '()' = (g.key == n.key).if_true(
          then = n.value, else = n.next.get(key = g.key))}}}}
  # Five types of expression: names, method calls, override or
  # "derivation" expressions, object literals, and constants.  I'm
  # leaving out constants for now.
  name = {v: name = "x", eval = {e: env=top.phi, '()'=e.env.get(key=v.name)}}
  call = {c: object = top.name, method_name = "walnuts",
    eval = top.name.eval {e:
      object = c.object.eval(env = e.env)
      '()' = e.object.properties().get(key=c.method_name)}}
  literal = {l: self = "self", methods = top.no_defs,
    eval = top.name.eval {e:
      '()' = l.methods.bind(base=top.proto_object, env=e.env, self=l.self)}}
  derivation = {d: object = top.name, methods = top.no_defs, self = "self",
    eval = top.call.eval {e:
      '()' = d.methods.bind(base=e.object, env=e.env, self=d.self)}}
  # Derivation and object-literal expressions contain lists of method 
  # definitions:
  no_defs = {o: bind = {b: base = top.proto_object, '()' = b.base } }
  definition = top.no_defs {d: name="foobar", body=top.name, next=top.no_defs,
    bind = {b: base = top.proto_object, env = top.phi, self = "self",
      '()' = d.next.bind(
        base = b.base.derive(name=d.name, body=d.body, self=b.self, env=b.env)
        env = b.env
        self = b.self)}}
  proto_object = {o:
    get = {g: key = "quux",
      '()' = globals.error(err="method not found", what=g.key)}
    properties = { '()' = top.phi }
    derive = {d: name="quux", self="self", body=top.name, env=top.phi,
        '()' = top.proto_object {m: name=d.name, self=d.self, body=d.body,
          env=d.env, next=o,
          get = {g: key = "quux", '()' = (g.key == m.name).if_true(
            then = m, else = m.next.get(key = g.key))}
          properties = {a: self = m,
            '()' = m.next.properties(self=a.self).add(
               key=m.name, value=m.apply(self=a.self))}
          apply = top.apply { method = m }}}}
  apply = {a:
    self = top.proto_object
    method = top.proto_object.derive()
    '()' = a.method.body.eval(
      env = a.method.env.add(key = a.method.self, value = a.self))}
}

Essentially, an object is represented as a linked list of method
definitions.  Each method definition contains its name, the name it
binds to refer to the object it's called on, the expression containing
its body (in a more sophisticated system, perhaps the body would be
represented as compiled code rather than just an expression), and the
environment in which it was defined.  The linked list is terminated by
'proto_object', the ancestor of all objects, which has no methods.

There are two things you can do with an object: you can get a method
definition out of it (to apply to the object or to another object
derived from it) or you can get one of its properties --- the result
of applying the method to the object itself.  These are implemented by
the 'get' and 'properties' methods.

I probably should have used the alist implemented by 'phi' to hold the
methods of an object, and then wrapped it in some kind of 'object'
object that implements 'get' and 'properties', but instead the methods
are directly linked to each other; so there's no distinction between a
method and the object whose last-added method is that method.  This
may be confusing.

Since you can get the method out of the object and then apply it to
the object, you don't really need the 'properties' method.  I added it
so that there would be a place to store previously-computed results so
they could be fetched instead of recomputed.  'properties' has a '()'
method that returns an alist of all the properties of the object.

The 'properties' method recurses down the list of methods,
accumulating an alist.  This implementation is wasteful in two ways:
- If you have an object with ten methods, each method has a separate
  'properties()' alist.  Since the language is lazy, this isn't
  terribly inefficient.
- If you override the same method ten times, the 'properties()' alist
  will contain ten values for the method, most of which will be
  nonsense and never used.  As long as the language is
  side-effect-free, this isn't a horrible problem, but it does affect
  efficiency.  But it's probably better to use a hash table or
  something anyway, reserving the pure alist mechanism for pedagogical
  explanations, proofs, and the like.

The theory is that if you have some object "walnuts" and you want its
property "cherries", then if you write
'walnuts.get(key="cherries").apply(self=walnuts)', this reduces to
something like "walnuts.next.next.next.apply { self=walnuts }.'()'",
which involves an override and so would require a fairly clever
interpreter to cache, while if you say
'walnuts.properties().get(key="cherries")', this can reduce merely to
"walnuts.properties.'()'.next.next.next.value", which is merely
navigation into a lazily constructed tree of objects.  While the next
time you evaluate the same expression, get will still have to walk
down the list of properties to find the right one, when it finds it,
it won't have to go through the cache-busting process of overriding
stuff again.

Library Inheritance
-------------------

The metacircular_bicicleta_interpreter above is not a class in the
conventional OO sense --- it contains several data types and some
functions.  But we can still inherit from it.  For example, we could
have written the memoizing_bicicleta_interpreter as follows instead:

memoizing_bicicleta_interpreter = foo.metacircular_bicicleta_interpreter {top:
  nonmemoizing = foo.metacircular_bicicleta_interpreter
  call = nonmemoizing.call {c: eval = nonmemoizing.call.eval {e:
    '()' = e.object.properties().get(key=c.method_name)}}
  proto_object = nonmemoizing.proto_object {o:
    properties = { '()' = top.phi }
    derive = nonmemoizing.proto_object.derive {d:
      '()' = nonmemoizing.proto_object.derive.'()' {m:
        properties = {a: self = m,
          '()' = m.next.properties(self=a.self).add(
            key=m.name, value=m.apply(self=a.self))}}}
}

(There's probably a bug in there somewhere.  I hope it doesn't make
the code incomprehensible.)

It's interesting that this formalism allows for the memoizing
functionality to be added in a relatively orthogonal way, with
relatively few lines of code.  DeLesley Hutchins, in his Ohmu paper at
OOPSLA a few years ago, argued that a language with this kind of
structure allows aspect-oriented programming to be integrated cleanly
with the rest of the language, rather than treated as an add-on.  I'm
not sure my grasp of AOP is firm enough to assent or dissent.

Another point of view is that this allows a cleaner expression for
functionality that's most cleanly divided along functional, rather
than object, lines.  For example, with the same technique, we could
add a 'map' method to (a copy of) 'phi' and 'phi.add', as if we were
programming in ML or CLOS rather than a more conventional
object-oriented language.

Hutchins also suggests that this pervasive ability to override could
be used to integrate version control into the language --- older
versions of things might merely be their superclasses.  I'm not sure
if this is a good idea, but it might be.

More experience will be needed before we know whether this kind of
program structure is generally a good idea or not.  It may be that it
impedes understanding more than it aids it.  It certainly provides
more flexibility in program structure.

At times I've thought it would be better if you could write this:
    call.eval.'()' = e.object.properties().get(key=c.method_name)
rather than the deeply nested expression above.  Clearly this needs someplace
to put the names 'e' and 'c'.

metacircular Bicicleta-language interpreter

Reply via email to