Re: [bitc-dev] Implementation of indentation-aware parse

John Leuner Sat, 28 Feb 2009 12:58:43 -0800

On Thu, 2009-02-26 at 11:48 -0500, Jonathan S. Shapiro wrote:
> Some languages have used indentation to encode block structure, most
> obviously python. I do not personally like this approach, and I have
> not come to any decision about it, but there is now some empirical
> evidence that (a) users find it more readable, and (b) current
> programmers will elect to use it by choice when the opportunity is
> presented. [c.f. F#].
> 
> Can anyone explain how one goes about *implementing* an
> indentation-aware syntax? Does this require a hand parser, or can it
> be done with lexer thuggery? I can probably work it out if I give it
> some thought, but I haven't had a chance to do that.


I have implemented parsers in the OMeta language that handle indentation
matching as part of the PEG rules. PEGs are 'scannerless' in that they
do not need a separate lexer, which gives the grammar direct control
over whitespace (at the cost of denser grammars).

In OMeta it is possible to pass arguments through to each subrule. I use
this to pass the current level of indentation.

To establish a new level of indentation the 'inset' rule takes a
previous inset as an argument and then parses more whitespace to yield
the new inset.

inset :inset = match-inset(inset) ws+:new-whitespace
               -> << combine the two whitespace variables >>,

'match-inset' uses a primitive that just matches its argument:

match-inset :inset = seq(inset),
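
As a rough illustration outside OMeta, the two rules above might look
like this in Python. This is only a sketch: the function names, the
position-based interface and the choice of space/tab as whitespace are
assumptions, not part of the OMeta grammar.

```python
from typing import Optional, Tuple


def match_inset(text: str, pos: int, inset: str) -> Optional[int]:
    """Match the literal string `inset` at `pos`.

    Returns the position after the match, or None on failure.
    """
    if text.startswith(inset, pos):
        return pos + len(inset)
    return None


def new_inset(text: str, pos: int, inset: str) -> Optional[Tuple[int, str]]:
    """Match the previous inset, then one or more extra whitespace chars.

    Returns (position after the whitespace, combined inset), or None if
    the previous inset does not match or no extra whitespace follows.
    """
    after = match_inset(text, pos, inset)
    if after is None:
        return None
    end = after
    while end < len(text) and text[end] in " \t":
        end += 1
    if end == after:
        return None  # 'ws+' needs at least one whitespace character
    return end, inset + text[after:end]
```

Because the inset is an ordinary string argument, a caller can hand the
combined value to any subrule that should match at the deeper level.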

Then it becomes possible to have a rule like:

indented-expression :inset = match-inset(inset)
                             (if-exp(inset) | loop-form(inset)) newline

and 

method-body :inset = newline inset(inset):inset
                     (indented-expression(inset))*

(In the rule above, inset(inset):inset matches the previous inset and
rebinds the local inset variable to the new inset.)
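
To show how threading the inset through the rules plays out, here is a
small Python sketch. It is a line-based simplification, not a
scannerless PEG and not the OMeta implementation; parse_block,
leading_ws and the (text, children) node shape are hypothetical names
chosen for the example. Blank lines are not handled.

```python
from typing import List, Tuple

Node = Tuple[str, list]  # (line text without indentation, child nodes)


def leading_ws(line: str) -> str:
    """Return the run of spaces/tabs at the start of `line`."""
    return line[: len(line) - len(line.lstrip(" \t"))]


def parse_block(lines: List[str], i: int, inset: str) -> Tuple[List[Node], int]:
    """Parse consecutive lines indented by exactly `inset`.

    If the line after a statement extends `inset` with more whitespace,
    it opens a nested block, which is parsed with the longer inset.
    Returns the parsed nodes and the index of the first unconsumed line.
    """
    nodes: List[Node] = []
    while i < len(lines):
        ws = leading_ws(lines[i])
        if ws != inset:
            break  # dedent (or unexpected indent): let the caller decide
        text = lines[i][len(ws):]
        i += 1
        children: List[Node] = []
        if i < len(lines):
            nxt = leading_ws(lines[i])
            # New inset = previous inset plus at least one more character.
            if nxt.startswith(inset) and len(nxt) > len(inset):
                children, i = parse_block(lines, i, nxt)
        nodes.append((text, children))
    return nodes, i


source = "if a\n  b\n  loop\n    c\nd".split("\n")
tree, end = parse_block(source, 0, "")
```

Each recursive call receives its own inset, so returning from a nested
block restores the outer indentation level without extra bookkeeping.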



In previous versions I did not thread the inset variable through all the
rules and used a dynamically scoped variable instead. This relied on a
recursive implementation of the parsing machinery and on the ability to
modify dynamic bindings with Common Lisp's 'progv' form. I abandoned
that approach in favour of argument passing because the simpler parser
implementation was more portable and faster.
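
For readers unfamiliar with dynamic binding, the earlier approach can be
loosely imitated in Python with a context manager that rebinds a shared
value for the duration of a nested parse. This is only an analogy to
'progv', not a port of the Lisp code; bound_inset, current_inset and the
stack are hypothetical names.

```python
from contextlib import contextmanager
from typing import Iterator, List

# Stand-in for a dynamically scoped variable: the innermost binding is
# the last element of the stack.
_inset_stack: List[str] = [""]


def current_inset() -> str:
    """Return the inset that is in effect right now."""
    return _inset_stack[-1]


@contextmanager
def bound_inset(value: str) -> Iterator[None]:
    """Bind the inset to `value` until the with-block exits."""
    _inset_stack.append(value)
    try:
        yield
    finally:
        _inset_stack.pop()  # restore the outer binding, even on error


def line_matches(line: str) -> bool:
    """Rules consult the ambient inset instead of taking an argument."""
    return line.startswith(current_inset())
```

The rules no longer need an inset parameter, but the binding is only
correct if nested rules run inside the with-block, which is the
dependence on a recursive parser mentioned above.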


See the OMeta page for a description of the language and the reference
implementation.

http://www.cs.ucla.edu/~awarth/ometa/


John Leuner


_______________________________________________
bitc-dev mailing list
[email protected]
http://www.coyotos.org/mailman/listinfo/bitc-dev
