I'm currently in the process of porting a template system [1] to Perl 6 [2].

It's fun to write the parser as a Perl 6 grammar, but there's one thing
that I don't know how to solve elegantly.

The markup format allows arbitrary text, and optionally some tags
interleaved. Some of them stand on their own, like

[% setvar title Grammars: parse tags of which only some ... %]

And others have opening/closing pairs, and their proper nesting needs to
be enforced, for example

[% ifvar title %]
    <h1>[% readvar title %]
[% endifvar %]

(yes, the syntax is horrible, but when you write that stuff, 90% is
normal text, and only 10% markup or so, so that's OK, more or less).

Additionally, what goes between nested tags depends on the tags, for
example the [% verbatim %] ... [% endverbatim %] tag pair allows
unmatched [% in between (but that's the only case, so I could cheat a
bit if necessary).
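
For illustration, here is a minimal sketch of how the verbatim case might
be handled on its own. It assumes the delimiters are the literal '[%' and
'%]'; the names Verbatim, endverb and body are made up for this example
and are not part of the real grammar. The body consumes one character at
a time for as long as no complete end tag follows, so a stray '[%' is
simply treated as text.

```raku
grammar Verbatim {
    token TOP      { <verbatim> }

    # The complete closing tag (hypothetical helper token).
    token endverb  { '[%' <.ws> 'endverbatim' <.ws> '%]' }

    token verbatim {
        '[%' <.ws> 'verbatim' <.ws> '%]'
        # Anything goes in the body, including an unmatched '[%',
        # as long as it is not the start of the closing tag.
        $<body>=[ [ <!before <.endverb> > . ]* ]
        <.endverb>
    }
}

my $m = Verbatim.parse('[% verbatim %] a [% b [% endverbatim %]');
say $m<verbatim><body>.Str.raku;
```

A frugal .*? would not do the job inside a token, because ratcheting
prevents it from growing after its first (empty) match; hence the
explicit negative lookahead.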

I've started to parse the simple tags like this:

our $open  = '[%';
our $close = '%]';


token chunk { <literal> | <directive> }

token directive {
    $open ~ $close
    [<.ws> <command> <.ws> ]
}

proto token command { <...> }
token command:sym<comment> { <sym>  [ <!before $close> .]* }
token command:sym<include> { <sym> <.ws> <arg> }
rule  command:sym<setvar>  { <sym> <name> '='? <slurpy_arg> }
rule  command:sym<readvar> { <sym> <name> }

which seems to be a fairly idiomatic way, and factors out the matching
of open/closing delimiters. But that way, I don't see how I can properly
check for closing tags that follow some of the opening tags.

I could cheat, and say

rule  command:sym<ifvar>   { <sym> <name> '%]' <chunks>* '[%' 'endifvar' }

But that would be, well, cheating (and would probably mess up the
backtracking control).

Another idea is to make <directive> a proto token, and have each branch match
its own '[%'. But that's rather repetitive.

Another idea is to have separate nested_command and single_command rules,
and use them as alternatives, so that the matching of the delimiters is
written only twice. (Upon more reflection, this seems like the best
approach so far.)
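
A rough sketch of that last idea, under several assumptions: the
delimiters are hard-coded instead of using $open/$close, the rule names
single, nested and block are invented for this example, and <literal>
and <name> are guessed at, since their real definitions are not shown
above. The opened tag name is remembered in a lexical variable, so the
closing tag has to be 'end' followed by that same name.

```raku
grammar Template {
    token TOP     { <chunk>* }
    token chunk   { <nested> || <single> || <literal> }
    token literal { [ <!before '[%'> . ]+ }
    token name    { \w+ }

    # Stand-alone tags, e.g. [% readvar title %]
    token single  { '[%' <.ws> <command> <.ws> '%]' }

    # Paired tags, e.g. [% ifvar title %] ... [% endifvar %]
    token nested  {
        :my $kw;
        '[%' <.ws> <block> <.ws> '%]'
        { $kw = ~$<block><sym> }    # remember which tag was opened
        <chunk>*
        '[%' <.ws> 'end' $kw <.ws> '%]'
    }

    proto token command {*}
    token command:sym<readvar> { <sym> <.ws> <name> }

    # Tags that open a block go into their own proto (hypothetical).
    proto token block {*}
    token block:sym<ifvar>     { <sym> <.ws> <name> }
}

say so Template.parse(
    '[% ifvar title %]<h1>[% readvar title %]</h1>[% endifvar %]');
say so Template.parse('[% ifvar title %]<h1>[% readvar title %]</h1>');
```

With this layout a new paired tag only needs another block:sym<...>
candidate, but the delimiters are still spelled out in both single and
nested, so it reduces the repetition without removing it.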

Can anybody think of a pattern that solves this problem even more
elegantly, hopefully without any repetition?


[1] http://perlgeek.de/en/software/mowyw
[2] http://github.com/moritz/6mowyw
