If I want to parse a language that is sensitive to whitespace indentation (e.g. Python, Haskell), how do I do it using P6 rules/grammars?

The way I'd usually handle it is to have a lexer that examines leading whitespace and converts it into "indent" and "unindent" tokens. The grammar can then use these tokens in the same way that it would any other block delimiter.
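
To make that concrete, here is a rough sketch of the kind of lexer I mean, written in Python rather than P6. The function name indent_tokens and the LINE/INDENT/UNINDENT token names are made up for illustration, and it assumes indentation is done with spaces only (no tabs):

```python
def indent_tokens(source):
    """Yield (kind, value) tokens for each non-blank line of `source`.

    INDENT/UNINDENT tokens are emitted whenever the leading-space
    width changes. Hypothetical sketch: spaces only, no tab handling.
    """
    stack = [0]  # indentation widths of the currently open blocks
    for line in source.splitlines():
        stripped = line.lstrip(" ")
        if not stripped:
            continue  # blank lines do not open or close blocks
        width = len(line) - len(stripped)
        if width > stack[-1]:
            # Deeper than the current block: open one new block.
            stack.append(width)
            yield ("INDENT", width)
        else:
            # Shallower: close blocks until we reach a known width.
            while width < stack[-1]:
                stack.pop()
                yield ("UNINDENT", width)
            if width != stack[-1]:
                raise ValueError("inconsistent unindent: %r" % line)
        yield ("LINE", stripped)
    # Close any blocks still open at end of input.
    while len(stack) > 1:
        stack.pop()
        yield ("UNINDENT", 0)
```

The stack of open indentation widths is the only state carried from one line to the next.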

This requires a stateful lexer, because to work out the number of "unindent" tokens to emit for a line, it needs to know the indentation positions of the blocks that are currently open. How would I write a P6 rule that defines <indent> and <unindent> tokens? Alternatively (if a different approach is needed), how would I use P6 to parse such a language?
