http://pobox.com/~kragen/sw/sk.html
So, after a couple more days of work, this thing is now a lambda-calculus interpreter --- it translates lambda-calculus expressions into SKI-combinator expressions, then does lazy graph-reduction on the result. It's doing around 500-1000 reductions per second, which adds up to 20-40 iterations per second of simple-ish loops like 'fib' or 'factorial'. It has some obvious disadvantages: - the lambda-calculus is still kind of awkward to program in directly, since once I have more than two or three arguments, I rapidly lose track of positional correspondences; - 40 iterations per second is painfully slow; But it has some advantages as well: - it's fully tail-recursive - it's fully lazy --- one of the examples is an infinite list of integers, which you can apply the 'map' from another example to, etc. - it's not incredibly complex, although it's more complex than I'd like; it's under 600 lines of code, and includes the following bits: - lazy graph-reduction engine - parsers for SK-combinators and lambda-calculus - an unparser for them as well - debugging output - translation from lambda-calculus into SKI-combinators - sample programs including list-processing - you can run it just by going to a web page with a JavaScript-capable browser. <html><head><title>Lambda-calculus via SK-combinators in JavaScript</title> <meta http-equiv="Content-Type" content="text/html; charset=utf-8" /> <script src="http://pobox.com/~kragen/MochiKit-1.3.1/lib/MochiKit/MochiKit.js"></script> <script type="text/javascript"> //meta http-equiv content-type charset=UTF-8 // SK-combinator and λ-calculus evaluation by graph reduction -*- coding: utf-8 -*- // CONSTRUCTORS // Constructor makes an ML-like or Haskell-like constructor (which has // very little in common with an OO constructor, or a JavaScript // constructor). Once you say // App = Constructor("App", ["Fun", "Val"]) // then you can construct an object with {Fun: 3, Val: 4} with App(3, // 4) and you can also set things on App.prototype. You can't new // App(...) though, which you could (and would have to) if App were a // JavaScript constructor. function ConstructorObject(name, argnames, argvals) { this.tagname = name for (var ii = 0; ii < argnames.length; ii++) { this[argnames[ii]] = argvals[ii] } this.argnames = argnames } ConstructorObject.prototype = { toString: function() { var rv = this.tagname for (var ii = 0; ii < this.argnames.length; ii++) { rv = rv + ' ' + this[this.argnames[ii]] } return rv }, repr: function() { var inside = map(repr, this.argvals()).join(', ') return this.tagname + '(' + inside + ')' }, argvals: function() { var self = this return map(function(name) { return self[name] }, self.argnames) }, } function ArgumentsMismatch(argvals, argnames) { this.argvals = argvals this.argnames = argnames this.toString = function () { return ('ArgumentsMismatch: got: ' + repr(this.argvals) + ' expected: ' + repr(this.argnames)) } } function Constructor(name, argnames) { var newclass = function() { ConstructorObject.apply(this, arguments) } newclass.prototype = new ConstructorObject(name, argnames, argnames) var rv = function() { if (arguments.length != argnames.length) { throw new ArgumentsMismatch(arguments, argnames) } return new newclass(name, argnames, arguments) } rv.prototype = newclass.prototype // for convenience rv.match_type = function(obj) { return name == obj.tagname } return rv } // GRAPH NODES // There are three types of graph nodes: applications (App), variables // (Var), and indirection nodes (Ind). An application is a function // being applied to an argument, and a variable is an atom like 'S' or // 'K' or '1' or '+'. An indirection node is merely an alias for some // other node; they exist because we sometimes need to actually mutate // graph nodes into other graph nodes, specifically for K and I. // Originally I had separate constructors for App and Var, but // sometimes in graph-reduction, you need to mutate an App into a Var, // or at least an Ind to a Var. I want different behavior (e.g. for // rendering) in the different kinds of graph nodes, but I don't know // how to change the constructor of a JavaScript object at run-time. // In Python, I would just set the __class__ to change the behavior; // in Perl I would rebless. (The obvious approach, object.constructor // = something_else, doesn't work in my Firefox.) // So now I use a single Node constructor for all three of App, Var, // and Ind, and put the per-type code into strategy objects App_type // etc., and wrote corresponding functions for creation of each node // type. Node = Constructor("Node", ["Type", "Args"]) function delegate(name) { var method = name + '_of' return function() { return this.Type[method].apply(this.Type, this.Args) } } Node.prototype.repr = delegate('repr') Node.prototype.unparenthesized = delegate('unparenthesized') Node.prototype.parenthesized = delegate('parenthesized') Node.prototype.as_operator = delegate('as_operator') Node.prototype.become = function(other) { this.Type = other.Type this.Args = other.Args this.free_vars = other.free_vars } Var_type = { tagname: 'Var', repr_of: function(name) { return "Var(" + repr(name) + ")" }, unparenthesized_of: function(name) { return name }, parenthesized_of: function(name) { return name }, as_operator_of: function(name) { return name }, } App_type = { tagname: 'App', repr_of: function(fun, arg) { return "App(" + repr(fun) + ", " + repr(arg) + ")" }, unparenthesized_of: function(fun, arg) { return fun.as_operator() + ' ' + arg.parenthesized() }, parenthesized_of: function(fun, arg) { return '(' + this.unparenthesized_of(fun, arg) + ')' }, as_operator_of: function(fun, arg) { return this.unparenthesized_of(fun, arg) }, } Ind_type = { tagname: 'Ind', repr_of: function(target) { return "Ind(" + repr(target) + ")" }, unparenthesized_of: function(target) { return target.unparenthesized() }, parenthesized_of: function(target) { return target.parenthesized() }, as_operator_of: function(target) { return target.as_operator() }, } function Var(name) { var rv = Node(Var_type, [name]) rv.free_vars = {} rv.free_vars[name] = 1 return rv } function App(fun, arg) { var rv = Node(App_type, [fun, arg]) rv.free_vars = merge(fun.free_vars, arg.free_vars) // from MochiKit.Base return rv } function Ind(target) { // no need to create an Ind node for an Ind node if (target.Type == Ind_type) return target var rv = Node(Ind_type, [target]) rv.free_vars = target.free_vars return rv } function pretty_print(expr) { return expr.unparenthesized() } // TOKENIZING AND PARSING function tokenize(expr) { // Hooray for regexps! var tokens = expr.match(/\s+|[^\s\(\)]+|\(|\)/g) var wsp_re = /\s/ return ifilter(function(token) { return !token.match(wsp_re) }, tokens) } // parsing exceptions EmptyParens = Constructor("EmptyParens", ["Where"]) ExtraRightParen = Constructor("ExtraRightParen", ["Where"]) MissingRightParen = Constructor("MissingRightParen", []) // wow, this grammar is really easy to shift-reduce by hand :) // but I should probably change it to use the generic dumbass parser function sk_parse(expr) { var stack = [null] var tokens_so_far = [] // for error reporting forEach(tokenize(expr), function(token) { tokens_so_far.push(token) var val if (token == '(') { stack.unshift(null) return // do not construct an App } else if (token == ')') { if (stack.length == 1) throw ExtraRightParen(tokens_so_far.join(' ')) val = stack.shift() if (!val) throw EmptyParens(tokens_so_far.join(' ')) } else { val = Var(token) } if (stack[0]) stack[0] = App(stack[0], val) else stack[0] = val }) if (stack.length != 1) throw MissingRightParen() return stack[0] } UnevaluableArgument = Constructor("UnevaluableArgument", ["Fun", "Arg"]) NonNumericArgument = Constructor("NonNumericArgument", ["Fun", "Arg"]) function strict_value(context, expr) { if (expr.Type == Ind_type) expr = expr.Args[0] if (expr.Type != Var_type) expr = sk_reduce(expr) if (expr.Type != Var_type) throw UnevaluableArgument(context, expr.parenthesized()) return expr.Args[0] } function isnan(n) { return n.toString() == 'NaN' } function numeric_value(context, expr) { var string_value = strict_value(context, expr) var value = parseFloat(string_value) if (isnan(value)) throw NonNumericArgument(context, string_value) return value } function binary_arithmetic(name, fun) { return [2, function(a, b) { return Var(fun(numeric_value(name + ' (left)', a), numeric_value(name + ' (right)', b))) }] } InvalidPredicateValue = Constructor("InvalidPredicateValue", ['Value']) built_ins = { // This is the set of combinators from Simon Peyton Jones' 1987 // "The Implementation of Functional Programming Languages", // chapter 16; he says that the B, C, S', and C' combinators were // originated by David Turner in order to implement SASL in the // late 1970s, and the B* combinator by Mark Sheevel of Burroughs // Corp. as an alternative to the B' combinator // B' c f g x -> c f (g x). // S and K are sufficient to build all the others. S: [3, function(f, g, x) { return App(App(f, x), App(g, x)) }], K: [2, function(k, _) { return Ind(k) }], I: [1, function(x) { return Ind(x) }], B: [3, function(f, g, x) { return App(f, App(g, x)) }], C: [3, function(f, g, x) { return App(App(f, x), g) }], "S'": [4, function(c, f, g, x) {return App(App(c, App(f, x)), App(g, x))}], "B*": [4, function(c, f, g, x) {return App(c, App(f, App(g, x)))}], "C'": [4, function(c, f, g, x) {return App(App(c, App(f, x)), g)}], // Again, this can be done with S and K, but it's simpler to: Y: [1, function(f) { return App(f, App(Var("Y"), f)) }], // And here are some arithmetic operators. '+': binary_arithmetic('+', operator.add), '*': binary_arithmetic('*', operator.mul), '/': binary_arithmetic('/', operator.div), '-': binary_arithmetic('-', operator.sub), '%': binary_arithmetic('%', operator.mod), '&': binary_arithmetic('&', operator.and), '|': binary_arithmetic('|', operator.or), '^': binary_arithmetic('^', operator.xor), '<<': binary_arithmetic('<<', operator.lshift), '>>': binary_arithmetic('>>', operator.rshift), '>>>': binary_arithmetic('>>>', operator.zrshift), '==': binary_arithmetic('==', operator.eq), '!=': binary_arithmetic('!=', operator.ne), '>': binary_arithmetic('>', operator.gt), '<': binary_arithmetic('<', operator.lt), '>=': binary_arithmetic('>=', operator.ge), '<=': binary_arithmetic('<=', operator.le), // and a conditional 'if': [3, function(pred, conseq, alt) { var pred_rv = strict_value('if', pred) if (pred_rv == true) return Ind(conseq) else if (pred_rv == false) return Ind(alt) else throw InvalidPredicateValue(pred_rv) }], } // Suppose you want a 'cons' whose result takes two arguments and // applies the first to its contents. That's // \car.\cdr.\f.\g.f car cdr // but translating that by hand into combinators is beyond my ability // at the moment. // Suppose instead we just want 'car'. That's easy --- it's K. How about cdr? // \car.\cdr.cdr // \car.I // K I // Suppose we just want a 'cons' that doesn't afford a nil test: // \car.\cdr.\f.f car cdr // \car.\cdr.C(\f.f car) cdr // \car.\cdr.C (C I car) cdr // \car.B (C (C I car)) I // C (\car.B (C (C I car))) I // C (B B (\car. C (C I car))) I // C (B B (B C (\car. C I car))) I // C (B B (B C (B (C I) I))) I // cons = 'C (B B (B C (B (C I) I))) I' // and that works --- if you apply it to 1 and 2, you get // C (B (C I) I 1) (I 2) // and then applying that to K gives you 1, or applying to (K I) gives you 2. // the 'square' function: // \a.* a a -> S(\a.* a) I -> S(B * I) I // 'max': // \a.\b.if (> a b) a b // \a.S(\b.if (> a b) a) I // \a.S(C (\b.if (> a b)) a) I // \a.S(C (B if (\b. > a b)) a) I // \a.S(C (B if (> a)) a) I // C (\a.S(C (B if (> a)) a)) I // C (B S (\a. C (B if (> a)) a)) I // C (B S (S (\a. C (B if (> a))) I)) I // C (B S (S (B C (\a. B if (> a))) I)) I // C (B S (S (B C (B (B if) >)) I)) I ski_samples = { cons: 'C (B B (B C (B (C I) I))) I', square: 'S (B * I) I', max3: 'S (C (B if (> 3)) 3) I', max: 'C (B S (S (B C (B (B if) >)) I)) I', } function samples_div(samples, input_name, label) { var rv = [] var inp_code = '$(' + repr(input_name) + ')' for (var name in samples) { var code = inp_code + '.value = ' + repr(samples[name]) + "+ ' '; " + inp_code + '.focus()' rv.push(A({href: 'javascript:' + code + '; void 0'}, name)) rv.push(' ') } return DIV(null, label, ': ', rv) } lambda_samples = { cons: '(\\car.\\cdr.\\f.f car cdr)', fancycons: '(\\car.\\cdr.\\ifcons.\\ifnil.ifcons car cdr)', fancynil: '(\\ifcons.\\ifnil.ifnil)', fancycar: '(\\cons.cons \\car.\\cdr.car x)', fancycdr: '(\\cons.cons \\car.\\cdr.cdr x)', fact20: '(Y (\\fact.(\\x.(if (< x 2) 1 (* x (fact (- x 1))))))) 20', max: '(\\x.\\y.if (> x y) x y)', square: '(\\x.* x x)', list3: '(\\cons.\\nil.\\car.\\cdr.cons 1 (cons 2 (cons 3 nil))) (\\car.\\cdr.\\ifcons.\\ifnil.ifcons car cdr) (\\ifcons.\\ifnil.ifnil) (\\cons.cons (\\car.\\cdr.car) x) (\\cons.cons (\\car.\\cdr.cdr) x)', 'list3 with map': '(\\cons.\\nil.(\\car.\\cdr.\\map.car (map (+ 2) (cons 1 (cons 2 (cons 3 nil))))) (\\cons.cons (\\car.\\cdr.car) x) (\\cons.cons (\\car.\\cdr.cdr) x) (Y (\\map.\\f.\\list.list (\\car.\\cdr.cons (f car) (map f cdr)) (\\ifcons.\\ifnil.ifnil)))) (\\car.\\cdr.\\ifcons.\\ifnil.ifcons car cdr) (\\ifcons.\\ifnil.ifnil)', ints_from: '(\\cons.Y (\\ints.\\n.cons n (ints (+ n 1)))) (\\car.\\cdr.\\ifcons.\\ifnil.ifcons car cdr)', 'dumb fib': '(Y (\\fib.\\n.if (< n 2) 1 (+ (fib (- n 1)) (fib (- n 2)))))', } function last(list) { return list[list.length - 1] } ReductionError = Constructor("ReductionError", ["State", "Error"]) ReductionError.prototype.toString = function() { return "ReductionError in " + this.State + " because:\n" + this.Error } reduction_count = 0 function sk_reduce(expr) { var spine = [expr] for (;;) { var expr = spine[spine.length - 1] // These logging calls slow it down to a ridiculous extent: // log("examining " + expr.parenthesized() + // " in " + spine[0].parenthesized()) // logDebug(' (' + repr(expr) + ')') if (expr.Type == App_type) { spine.push(expr.Args[0]) } else if (expr.Type == Var_type) { var name = expr.Args[0] var defn = built_ins[name] if (!defn) return spine[0] var nargs = defn[0] var fun = defn[1] if (spine.length <= nargs) return spine[0] var args = [] for (var ii = 0; ii < nargs; ii++) { spine.pop() args.push(last(spine).Args[1]) // it's an App, so Args[1] is arg } try { reduction_count += 1 last(spine).become(fun.apply(this, args)) } catch (e) { throw ReductionError(spine[0].parenthesized(), e.toString()) } } else if (expr.Type == Ind_type) { spine.pop() spine.push(expr.Args[0]) } else { throw ReductionError(spine[0].parenthesized(), "Missing case for expression") } } } function eval_sk(expr) { try { return pretty_print(sk_reduce(sk_parse(expr))) } catch (e) { return "Error: " + e } } // λ-calculus implementation. For convenient typing, we use \ for λ. // The basic translation is as follows: // T[x] = x // T[E F] = T[E] T[F] // T[λx.E] = A_x[T[E]] // A_x[x] = I // A_x[y] = K y // A_x[E F] = S A_x[E] A_x[F] // A_x doesn't need to avoid replacing non-free occurrences of x // because the expression it's rewriting came from T, and therefore // contains no λs and therefore no non-free variables. // The slightly more advanced version of A_x cares about whether E and // F contain x free or not. If we use E_x and F_x to indicate those // that do: // A_x[E F] = K (E F) = J E F // A_x[E_x F] = C A_x[E_x] F // A_x[E F_x] = B E A_x[E_x] // A_x[E_x F_x] = S A_x[E_x] A_x[F_x] // We already have Var and App types from the combinator calculus; we // can reuse those. Now we just need lambda-abstractions: Abstraction_type = Constructor("Abstraction", ["Arg", "Body"]) Abstraction_type.prototype.unparenthesized = function() { return '\\' + this.Arg + '.' + this.Body.unparenthesized() } Abstraction_type.prototype.parenthesized = function() { return '(' + this.unparenthesized() + ')' } Abstraction_type.prototype.as_operator = function() { return this.parenthesized() } function Abstraction(arg, body) { var rv = Abstraction_type(arg, body) rv.free_vars = merge(body.free_vars) rv.free_vars[arg] = null return rv } NotLambdaExpression = Constructor("NotLambdaExpression", ["What"]) function translate(lamex) { if (lamex.tagname == 'Abstraction') { return remove(lamex.Arg, translate(lamex.Body)) } else if (lamex.Type == Var_type) { return lamex } else if (lamex.Type == App_type) { return App(translate(lamex.Args[0]), translate(lamex.Args[1])) } else { throw NotLambdaExpression(lamex) } } // Their functionality comes from the magic table built_ins up higher. S_combinator = Var('S') K_combinator = Var('K') I_combinator = Var('I') B_combinator = Var('B') C_combinator = Var('C') // remove a variable from a lambda-expression by SKI-translation function remove(varname, expr) { // we expect expr to be combinators, not lambdas if (expr.Type == App_type) { if (!expr.free_vars[varname]) return App(K_combinator, expr) var rator = expr.Args[0], rand = expr.Args[1] if (rator.free_vars[varname] && rand.free_vars[varname]) { return App(App(S_combinator, remove(varname, rator)), remove(varname, rand)) } else if (rator.free_vars[varname]) { return App(App(C_combinator, remove(varname, rator)), rand) } else { return App(App(B_combinator, rator), remove(varname, rand)) } } else if (expr.Args[0] == varname) { return I_combinator } else { return App(K_combinator, expr) } } VarToken = { match_type: function(maybe_var) { return Node.match_type(maybe_var) && maybe_var.Type == Var_type } } Expr = { match_type: function(maybe_expr) { return Abstraction_type.match_type(maybe_expr) || Node.match_type(maybe_expr) } } lambda_productions = [ ['(', Expr, ')'], function(a, b, c) { return b }, // Can I simplify these cases? ['(', '\\', VarToken, '.', Expr, ')'], function(_, _, v, _, e, _) { return Abstraction(v.Args[0], e) }, ['.', '\\', VarToken, '.', Expr, ')'], function(dot, _, v, _, e, rp) { return [dot, Abstraction(v.Args[0], e), rp] }, [Expr, '\\', VarToken, '.', Expr, ')'], function(rator, _, v, _, e, rp) { return [rator, Abstraction(v.Args[0], e), rp] }, [Expr, Expr], App, // These occur when the second Expr is a lambda-abstraction that // just got reduced at the very end. ['(', Expr, Expr, ')'], function(_, a, b, _) { return App(a, b) }, ['.', Expr, Expr, ')'], function(x1, a, b, x2) { return [x1, App(a, b), x2] }, [Expr, Expr, Expr, ')'], function(x1, a, b, x2) { return [x1, App(a, b), x2] }, ] ParseError = Constructor("ParseError", ["State"]) function parse(tokens, productions) { var stack = [] forEach(tokens, function(tok) { stack.push(tok) while (reduce_stack(stack, productions)) { } }) if (stack.length != 1) { throw ParseError(map(repr, stack)) } return stack[0] } function reduce_stack(stack, productions) { for (var ii = 0; ii < productions.length; ii += 2) { if (match_top(stack, productions[ii])) { var len = productions[ii].length var args = [] while (args.length < len) args.unshift(stack.pop()) var result = productions[ii+1].apply(null, args) if (result.constructor != Array) result = [result] forEach(result, function(item) { stack.push(item) }) return true } } return false } function match_top(stack, prod) { for (var ii = 0; ii < prod.length; ii++) { var si = stack.length - ii - 1 if (si < 0) return false if (!match(prod[prod.length - ii - 1], stack[si])) return false } return true } function match(type, obj) { if (type.constructor == String) { return (type == obj) } else { return type.match_type(obj) } } function tokenize_lambda(expr) { // Hooray for regexps! var tokens = expr.match(/[^\s.\\\(\)]+|\.|\\|\(|\)/g) var var_re = /[^\s.\\\(\)]/ return imap(function(token) { if (token.match(var_re)) { return Var(token) } else { return token } }, chain(['('], tokens, [')'])) } function lambda_parse(expr) { return parse(tokenize_lambda(expr), lambda_productions) } </script> <script type="text/javascript"> focusOnLoad('input') function handle_input() { var start = new Date() var start_reductions = reduction_count if ($('lambda').value) { try { var lambdas = lambda_parse($('lambda').value) $('lambda').value = lambdas.unparenthesized() $('combinators').value = translate(lambdas).unparenthesized() } catch (e) { $('results').value += '' + e + '\n' + e.stack + '\n' } } var end_translate = new Date() replaceChildNodes('translatesecs', (end_translate.getTime() - start.getTime()) / 1000) $('results').value += eval_sk($('combinators').value) + '\n' replaceChildNodes('evalsecs', ((new Date()).getTime() - end_translate.getTime()) / 1000) replaceChildNodes('reductions', reduction_count - start_reductions) } </script> </head><body> <h1>SK combinators</h1> <form onsubmit="handle_input(); return false" > <textarea id="results" cols="80" rows="20"></textarea> <br /> <table cellpadding="0" rowpadding="0"> <tr><td align="right">SKI</td> <td>: <input id="combinators" value="S (B * I) I 9" size="75" onchange="$('lambda').value = ''" /></td></tr> <tr><td align="right">λ</td> <td>: <input id="lambda" value="" size="75" /></td></tr> </table> <input type="submit" /> </form> <script type="text/javascript"> document.write(toHTML(samples_div(ski_samples, 'combinators', 'SKI samples'))) </script> <script type="text/javascript"> document.write(toHTML(samples_div(lambda_samples, 'lambda', 'λ samples'))) </script> <span id="translatesecs">N</span> seconds to translate to SKI, then <span id="evalsecs">N</span> seconds to evaluate (<span id="reductions">N</span> reductions). </body></html> -- Kragen Javier Sitaker in Buenos Aires, trying to get a clue
