Re: [PEG] Easy way to parse indented syntax by adding dimension?

Dustin Voss Mon, 25 Nov 2013 00:01:51 -0800

The “>>>” rule would need access to the rule(s) that follow it. What you 
actually have here is a rule “>>> X”, where X is a sub-rule of the “>>>” rule. 
Each occurrence of “>>>” is a distinct rule from every other “>>>”, kind of like 
C++ template instantiations.


Your rules would have to employ some syntax to separate the “>>> X” part of the 
rule from any following sub-rules that should not be indented, as in this 
example:

block <- ‘begin’ ( >>> stmt ) ‘end’

begin
   do-something
end

You would also need a way to control how many stmt lines can show up. Can there 
be only one “stmt” line, or several? Two syntaxes that would NOT work are:

block <- ‘begin’ ( >>> stmt+ ) ‘end’
block <- ‘begin’ ( >>> stmt )+ ‘end’

The first one wouldn’t work because the “+” would have to bind to “stmt”, 
meaning several statements on one line. The second wouldn’t work because 
wrapping the “>>>” rule itself in a repetition makes each “>>>” match distinct, 
with its own internal state, so nothing keeps the indentation column consistent 
from one line to the next.

But those are all notation issues, and can be easily solved one way or another.
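
As a rough illustration of the “>>> X” idea, here is a minimal Python sketch. 
It is not taken from any existing PEG library: the names Rule, regex, 
line_indent, indented and block are all made up for this example, and it 
loosely follows the steps quoted below while ignoring comments, blank lines, 
tabs and memoization. Calling indented(stmt) builds a separate parser for each 
sub-rule, and the block column lives inside a single invocation, which is why 
wrapping the call in a repetition would lose the column.

```python
import re
from typing import Callable, Optional

# A rule takes (text, pos) and returns the new position, or None on failure.
Rule = Callable[[str, int], Optional[int]]


def regex(pattern: str) -> Rule:
    """Build a rule that matches a regular expression at the current position."""
    compiled = re.compile(pattern)

    def parse(text: str, pos: int) -> Optional[int]:
        m = compiled.match(text, pos)
        return m.end() if m else None

    return parse


def line_indent(text: str, pos: int) -> int:
    """Count the leading spaces on the line that contains pos."""
    start = text.rfind("\n", 0, pos) + 1
    col = start
    while col < len(text) and text[col] == " ":
        col += 1
    return col - start


def indented(sub_rule: Rule) -> Rule:
    """Hypothetical '>>> X': one or more lines of X, all starting at the same
    column, which must be deeper than the indent of the enclosing line."""

    def parse(text: str, pos: int) -> Optional[int]:
        base = line_indent(text, pos)  # base indent of the head line
        block_col = None               # column shared by this block only
        end = None                     # end of the last accepted line
        while pos < len(text) and text[pos] == "\n":
            body = line_start = pos + 1
            while body < len(text) and text[body] == " ":
                body += 1
            col = body - line_start
            if col <= base:            # assumption: body must be strictly deeper
                break
            if block_col is None:
                block_col = col        # the first body line fixes the column
            elif col != block_col:
                break                  # line does not line up with the block
            nxt = sub_rule(text, body)
            if nxt is None:
                break
            pos = end = nxt
        return end                     # None if no body line matched

    return parse


# block <- 'begin' ( >>> stmt ) 'end', for a top-level block only.
begin = regex(r"begin")
stmt = regex(r"[a-z-]+")
end_kw = regex(r"\nend")
body_rule = indented(stmt)


def block(text: str, pos: int = 0) -> Optional[int]:
    p = begin(text, pos)
    if p is not None:
        p = body_rule(text, p)
    return end_kw(text, p) if p is not None else None
```

With this sketch, block("begin\n   do-something\n   do-more\nend") consumes 
the whole input, while a body line at a different column, or one that is not 
indented at all, makes the match fail.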


On Nov 24, 2013, at 10:31 PM, Henri Tuhola <henri.tuh...@gmail.com> wrote:

> Hi again.
> 
> You can already parse indentation with PEG by adding a tokenizing step or by 
> providing context. But if you treat the input as having two dimensions, 
> shouldn't it be easy to see that an indented block isn't context-sensitive 
> after all?
> 
> for i in range(6):
>     print(i)
>     print(i * 2)
> 
> There is a very clear pattern here, and you can't really parse the indentation 
> around the block any other way. So doesn't that mean it can be done with a 
> packrat parser? You only need a certain sort of extra rule for it:
> 
> stmt <- 'for' variable 'in' expression >>> stmt
> 
> The 'indent' (>>>):
> 
>  1. Memorize the column index as base-indent. Make sure the line starts with 
> this structure.
>  2. Match the head pattern.
>  3. Match a newline, then count spaces until a character is found, skipping 
> comments.
>  4. Fail if there is less spacing than the column index dictates.
>  5. Match the body pattern.
>  6. Repeat steps 3, 4 and 5 until the first failure, on the condition that the 
> spacing must line up so that it forms a block.
> 
> This happens within a single block, so it doesn't leak state around. I think 
> it's perhaps possible to synthesize a 2-D PEG. If someone figures out a way 
> to do exactly that, you could also try to parse:
> 
>        y
> k = ------
>      x**2
> 
> or this, if the earlier one turns out too insane:
> 
> k = y
>      ------
>      x**2
> 
> I read about someone doing parsing on scanned math expressions. So it doesn't 
> sound too impossible to consider that this might work just as well.
> _______________________________________________
> PEG mailing list
> PEG@lists.csail.mit.edu
> https://lists.csail.mit.edu/mailman/listinfo/peg

