On Thu, Jun 14, 2012 at 07:27:09AM -0400, Johnicholas Hines wrote: > What would a self-interpreter for this Femto-ML look like? > Something like Mogensen's self-interpreter? > http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.89.9933
(Torben Mogensen: I am Ccing you in case you want to respond, perhaps to point to a corrected version of the paper, since some of my remarks about your paper are snarky, which is unfair, since the part of the paper I'm talking about is peripheral to its main points, to extend a sentence far past its natural lifetime. I won't be offended if you don't feel the need to respond, but I feel like it would be unfair not to give you the opportunity. Further context, if you're interested, is at <http://lists.canonical.org/pipermail/kragen-tol/2012-June/000958.html>.) Johnicolas: maybe. I'm trying to understand the self-interpreter you're referring to, impeded somewhat by the variety of errors Mogensen left in the paper, perhaps as an extra challenge to readers. To clarify the discussion, here's his syntax, from the paper: Program → TypeDeclaration∗ FunctionDeclaration+ [sic, he means FunDeclaration+] TypeDeclaration → data TypeId = ConstructorDeclaration ConstructorDeclaration → ConstructorId TypeId∗ | ConstructorId TypeId∗ | ConstructorDeclaration FunDeclaration → FunctionId VariableId∗ = Expression Expression → VariableId | FunctionId Expression∗ | ConstructorId Expression∗ | case Expression of Branch+ Branch → ConstructorId VariableId∗ => Expression I note that he seems to have forgotten parentheses, and that his grammar is in any case quite ambiguous; there's no way to tell where the VariableIds and FunctionIds in an Expression in a FunDeclaration end, and where the FunctionId and VariableIds of the following FunDeclaration begin. In his example programs, he relies on whitespace (either newlines or indentation) to make this clear to the reader. And here's the interpreter, which is much easier to understand if you read the grammar first: data num = Z | S num data univ = Con num ulist data ulist = Un | Uc univ ulist data funs = Fn | Fun exp funs data exp = Var num | Fap num elist | Cap num elist | Case exp elist data elist = En | Ec exp elist run p args = apply Z p args p apply f fs args p = case fs of Fn => Con Z Un Fun e fs => case f of Z => eval e args p S f => apply f fs args p eval e vs p = case e of Var n => lookup n vs Fap n es => apply n p (evallist es vs p) p Cap n es => Con n (evallist es vs p) Case e es => case (eval e vs p) of Con n vs1 => eval (select n es) (append vs1 vs) p evallist es vs p = case es of En => Un Ec e es => Uc (eval e vs p) (evallist es vs p) select n es = case es of En => Cap Z En Ec e es => case n of Z => e S n => select n es Morgensen remarks, "The program is represented by a list of un-names [sic] functions, which in function calls are referred to by position. Similarly, variables and constructors are represented by their positions in declarations. Type declarations are not explicitly represented in the interpreter. The self-interpreter is shown in figure 4. Missing from this are definitions of `lookup`, `append`, `hd` and `tl`, which we have omitted for reasons of space." The missing `append` is relatively straightforward, once you figure out which of the redundant list types it needs to be: append vs1 vs = case vs1 of Un => vs Uc v vs2 => Uc v (append vs2 vs) `lookup` is presumably exactly the same as `select` but with a different list type: lookup n vs = case es of Un => Cap Z En Uc v vs => case n of Z => v S n => select n vs `hd` and `tl` are not actually used in the above code, so I have omitted them. The y do get used later in the paper. It's a bit of a cheat to call this a "self-interpreter", because, in his enthusiasm to write an interpreter that neither provides nor requires equality testing, he has written an interpreter that requires a representation of the program fairly far removed from the abstract syntax tree produced by a parser. All the variables must be replaced with de Bruijn indices, and all of the case-expressions must be padded out with the addition of extra expressions (perhaps `Cap Z En`) to cover the constructors for which the program does not specify an expression. Since the compiler needed to do these transformations from the AST is comparable in size to the interpreter, perhaps the interpreter should instead be called a "virtual machine". And the fact that it's written in the language it interprets is entirely irrelevant to the interesting parts of the paper. Mogensen's miniature language doesn't include nested patterns, equality tests in the pattern-matcher, or higher-order functions (or, consequently, closures or currying), so I suspect that a FeML metacircular interpreter would probably be substantially more code, even though the language is syntactically simpler. However, like Mogensen's example, it could omit type-checking entirely. Kragen -- To unsubscribe: http://lists.canonical.org/mailman/listinfo/kragen-discuss
