thanks for the responses. i've done a second take on the TAP grammar
in perl6 format, with the help of patrick michaud. i'm pretty sure
it's correct now, at least as per your previous grammar. have a look:

grammar TAP;

## Ovid's TAP grammar, translated, corrected, and rendered idiomatic
## NOTE: not yet extended to deal with 'Bail out!' etc.
token tap         { <plan> <lines> | <lines> <plan> <comment>* }
token plan        { <'1..'> \d+ \n }
token lines       { <line>+ }
token line        { <test> | <comment> }
token test        {
   <status>
   [ <' '> (<[1..9]> \d*) ]?              ## assumes a single space, not \h+
   \h* <description>? \h* <directive>? \n
                 }
token status      { <'not '>? <'ok'> }
token description { <after \h> <-[#\n]>+ }
token directive   { <after \h> <'# '> [:i todo | skip ] \N* }
token comment     { <'#'> \N* \n }
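
for anyone who wants to poke at the 'test' token without a perl6 implementation handy, here is a rough python sketch of the same shape. it is an approximation and not the grammar itself: the names TEST and parse_test are made up for illustration, the group names just mirror the rule names, \h is narrowed to space/tab and \d to [0-9], and python's re backtracks where a perl6 token would not, so edge cases can differ.

```python
import re

# Approximate Python rendering of the 'test' token above.
# Assumptions: \h -> [ \t], \d -> [0-9], \N -> [^\n].
TEST = re.compile(
    r"(?P<status>(?:not )?ok)"                 # token status
    r"(?: (?P<number>[1-9][0-9]*))?"           # single space, not \h+
    r"[ \t]*"
    r"(?P<description>(?<=[ \t])[^#\n]+)?"     # token description
    r"[ \t]*"
    r"(?P<directive>(?<=[ \t])# (?i:todo|skip)[^\n]*)?"  # token directive
    r"\n"
)


def parse_test(line):
    """Return the named captures for one test line, or None if it does not match."""
    m = TEST.fullmatch(line)
    return m.groupdict() if m else None
```

calling parse_test on a line such as "not ok 2 foo # TODO bar\n" gives back a dict keyed by status, number, description, and directive. note that the description keeps its trailing whitespace, as <-[#\n]>+ would in the grammar.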

notes:
~ there are some places i took liberties, eg. \d versus [:digit:] and \N
versus [:print:]. (are comments, directives, and descriptions really
limited to printable characters?) this leaves fewer named rules,
which is a significant optimization. if it is incorrect, it can be
modified to match the semantics of ovid's grammar more closely.
~ removed unnecessary indirections, like the 'tests' rule
~ reduced backtracking
~ ovid defines lines/line, but never uses them. this is fixed
~ not yet running on parrot against the straps tests, so some
assumptions about whitespace may be incorrect
~ assumed ordering of comments/tests is significant, otherwise 'lines'
can be replaced with the body of 'line'

overall, i think it looks really clean in perl6, and it will parse
really nicely. $<tap><lines> will be an array of all the lines, and
each element of $<tap><lines> will have either a <test> or a <comment>
key.
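
to make that shape concrete, here is a small python sketch of the result structure i have in mind. it is only an illustration under loose assumptions: the function name parse_tap and the dict layout are hypothetical, the line checks are much cruder than the grammar, and it does not enforce where the plan may appear.

```python
import re

PLAN = re.compile(r"1\.\.[0-9]+")
# 'ok' / 'not ok' must be followed by horizontal whitespace or end of line.
TEST_START = re.compile(r"(?:not )?ok(?:[ \t]|$)")


def parse_tap(text):
    """Build a dict loosely mirroring $<tap>: a plan plus an ordered list of lines."""
    result = {"plan": None, "lines": []}
    for raw in text.splitlines():
        if raw.startswith("#"):
            result["lines"].append({"comment": raw})   # element with a 'comment' key
        elif PLAN.fullmatch(raw):
            result["plan"] = raw
        elif TEST_START.match(raw):
            result["lines"].append({"test": raw})      # element with a 'test' key
        else:
            raise ValueError("unrecognized TAP line: %r" % raw)
    return result
```

each element of result["lines"] has exactly one key, either "test" or "comment", and the list keeps the original ordering of tests and comments.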

as an aside, patrick asked why C<1..0> was valid. i explained
C<no_plan>, and he asked why it wasn't C<1..> or the more perl6ish
C<1..*>. i know TAP isn't perl-only, but i have to say, C<1..*> is
really growing on me... 'one to whatever.' it looks and sounds better
than 'one to zero.' anyone else have thoughts on that?

~jerry
