Hi,

I'm currently in the process of porting a template system [1] to Perl 6 [2].
It's fun to write the parser as a Perl 6 grammar, but there's one thing that I don't know how to solve elegantly.

The markup format allows arbitrary text, optionally interleaved with tags. Some tags stand on their own, like

    [% setvar title Grammars: parse tags of which only some ... %]

Others come in opening/closing pairs, and their proper nesting needs to be enforced, for example

    [% ifvar title %]
    <h1>[% readvar title %]
    [% endifvar %]

(Yes, the syntax is horrible, but when you write that stuff, 90% is normal text and only 10% or so is markup, so that's OK, more or less.)

Additionally, what may appear between nested tags depends on the tag. For example the [% verbatim %] ... [% endverbatim %] pair allows an unmatched [% in between (but that's the only such case, so I could cheat a bit if necessary).

I've started to parse the simple tags like this:

    our $open  = '[%';
    our $close = '%]';
    ...
    token chunk { <literal> | <directive> }
    token directive { $open ~ $close [<.ws> <command> <.ws> ] }
    proto token command { <...> }
    token command:sym<comment> { <sym> [ <!before $close> .]* }
    token command:sym<include> { <sym> <.ws> <arg> }
    rule command:sym<setvar>   { <sym> <name> '='? <slurpy_arg> }
    rule command:sym<readvar>  { <sym> <name> }

This seems fairly idiomatic, and it factors out the matching of the opening and closing delimiters. But with that structure I don't see how I can properly check for the closing tags that must follow some of the opening tags.

I could cheat, and say

    rule command:sym<ifvar> { <sym> <name> '%]' <chunks>* '[%' 'endifvar' }

But that would be, well, cheating (and would probably mess up the backtracking control).

Another idea is to make <directive> a proto token, and have each branch match its own '[%'. But that's rather repetitive.

Yet another idea is to have separate nested_command and single_command rules, and use them as alternatives, so that the delimiter matching is duplicated only twice. (Upon more reflection, this seems like the best approach so far.)

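To make the nested_command / single_command idea concrete, here is a minimal sketch of what I have in mind. It is not the code from [2]: the grammar name Sketch and the tokens tag_open, tag_close, endname and literal are made up for illustration, only readvar and ifvar are wired up, and checking the end tag's name with a <?{ ... }> assertion is just one assumed way of doing it.

```perl6
# Sketch only. Sketch, tag_open, tag_close, endname and literal are
# hypothetical names, not taken from the real grammar.
grammar Sketch {
    token TOP       { <chunk>* }
    token chunk     { <directive> || <literal> }
    token literal   { [ <!before '[%'> . ]+ }   # plain text up to the next tag
    token name      { \w+ }
    token endname   { \w+ }
    token tag_open  { '[%' \s* }
    token tag_close { \s* '%]' }

    # The delimiters appear in exactly two alternatives: one for
    # stand-alone tags, one for opening/closing pairs.
    token directive {
        || <.tag_open> <single_command> <.tag_close>
        || <.tag_open> <nested_command> <.tag_close>
           <chunk>*
           <.tag_open> 'end' <endname> <.tag_close>
           # assumed check: the closing name must equal the opening one
           <?{ $<endname> eq $<nested_command><sym> }>
    }

    # proto bodies are spelled {*} here, the form current Rakudo accepts
    proto token single_command {*}
    token single_command:sym<readvar> { <sym> \s+ <name> }

    proto token nested_command {*}
    token nested_command:sym<ifvar>   { <sym> \s+ <name> }
}

# A properly closed pair, and an unclosed one.
say so Sketch.parse('[% ifvar title %]<h1>[% readvar title %]</h1>[% endifvar %]');
say so Sketch.parse('[% ifvar title %]<h1>');
```

The first call should parse and the second should not, because the unclosed ifvar never finds its endifvar. Every paired tag still goes through the same 'end' plus name check, so adding one means adding a single nested_command branch; a tag like verbatim, whose body is not made of ordinary chunks, would still need special treatment.
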
Can anybody think of a pattern that solves this problem even more elegantly, hopefully without any repetition?

Cheers,
Moritz

[1] http://perlgeek.de/en/software/mowyw
[2] http://github.com/moritz/6mowyw