On Thu, 2009-02-26 at 11:48 -0500, Jonathan S. Shapiro wrote:
> Some languages have used indentation to encode block structure, most
> obviously python. I do not personally like this approach, and I have
> not come to any decision about it, but there is now some empirical
> evidence that (a) users find it more readable, and (b) current
> programmers will elect to use it by choice when the opportunity is
> presented. [c.f. F#].
>
> Can anyone explain how one goes about *implementing* an
> indentation-aware syntax? Does this require a hand parser, or can it
> be done with lexer thuggery? I can probably work it out if I give it
> some thought, but I haven't had a chance to do that.
I have implemented parsers in the OMeta language that handle indentation matching as part of the PEG rules. PEG grammars are 'scannerless': they do not need a separate lexer, which gives the grammar direct control over whitespace (at the cost of denser grammars).

In OMeta it is possible to pass arguments through to each subrule; I use this to pass the current level of indentation.

To establish a new level of indentation, the 'inset' rule takes the previous inset as an argument and then parses more whitespace to yield the new inset:

    inset :inset = match-inset(inset) ws+:new-whitespace -> << combine the two whitespace variables >>,

'match-inset' uses a primitive that just matches its argument:

    match-inset :inset = seq(inset),

Then it becomes possible to have rules like:

    indented-expression :inset = match-inset(inset) ((if-exp(inset) | loop-form(inset)) newline

and

    method-body :inset = newline inset(inset):inset indented-expression(inset))*

(In the rule above, inset(inset):inset matches the previous inset and rebinds the local inset variable to the new inset.)

In previous versions I did not thread the inset variable through all the rules and used a dynamically scoped variable instead. This relied on a recursive implementation of the parsing machinery and on the ability to modify dynamic bindings with Common Lisp's 'progv' form. I abandoned that approach in favour of argument passing because the simpler parser implementation was more portable and faster.

See the OMeta page for a description of the language and the reference implementation:

http://www.cs.ucla.edu/~awarth/ometa/

John Leuner
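
P.S. For anyone who does not read OMeta, here is a rough Python sketch of the same argument-passing idea. It is not the OMeta implementation, only an illustration under some assumptions: the toy language (lines are either "if NAME:" or a bare NAME, with no blank lines) and every name in it (parse, parse_block, parse_line, new_inset, match_inset, ParseError) are made up for this example. Each rule receives the current inset string as an argument, and a block is entered by matching the old inset plus some extra whitespace.

```python
class ParseError(Exception):
    pass


def match_inset(text, pos, inset):
    """Counterpart of match-inset: succeed only if the exact inset string
    appears at pos. Returns the position after it, or None."""
    if text.startswith(inset, pos):
        return pos + len(inset)
    return None


def new_inset(text, pos, inset):
    """Counterpart of the inset rule: match the previous inset, then one or
    more extra spaces or tabs, and return the combined deeper inset."""
    start = match_inset(text, pos, inset)
    if start is None:
        return None
    end = start
    while end < len(text) and text[end] in " \t":
        end += 1
    if end == start:
        return None  # no extra whitespace, so no new indentation level
    return inset + text[start:end]


def parse_line(text, pos, inset):
    """Parse one statement that starts with exactly this inset.
    Returns (node, position of the next line)."""
    pos = match_inset(text, pos, inset)
    if pos is None:
        raise ParseError("inset mismatch")
    eol = text.find("\n", pos)
    if eol == -1:
        eol = len(text)
    line = text[pos:eol]
    nxt = min(eol + 1, len(text))
    if line.startswith("if ") and line.endswith(":"):
        # The body must be indented further than the current inset.
        deeper = new_inset(text, nxt, inset)
        if deeper is None:
            raise ParseError("expected an indented block after " + repr(line))
        body, nxt = parse_block(text, nxt, deeper)
        return ("if", line[3:-1], body), nxt
    return ("name", line), nxt


def parse_block(text, pos, inset):
    """Parse consecutive lines that all start with exactly this inset.
    Stops at the first line with a different indentation."""
    nodes = []
    while pos < len(text):
        after = match_inset(text, pos, inset)
        if after is None or after == len(text) or text[after] in " \t":
            break  # dedent (or unexpected extra indent): let the caller decide
        node, pos = parse_line(text, pos, inset)
        nodes.append(node)
    return nodes, pos


def parse(text):
    """Parse a whole program; the top-level inset is the empty string."""
    nodes, pos = parse_block(text, 0, "")
    if pos != len(text):
        raise ParseError("unexpected indentation at offset %d" % pos)
    return nodes
```

Calling parse("if a:\n    b\nc\n") should give an "if" node containing b, followed by c. Because the inset is compared as an exact string, a line indented with a tab will not match a block that was opened with spaces, which may or may not be the behaviour you want.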
