thanks for the responses. i've done a second take on the TAP grammar in perl6 format, with the help of patrick michaud. i'm pretty sure it's correct now, at least as per your previous grammar. have a look:

    grammar TAP;
    ## Ovid's TAP grammar, translated, corrected, and rendered idiomatic
    ## NOTE: not yet extended to deal with 'Bail out!' etc.

    token tap         { <plan> <lines> | <lines> <plan> <comment>* }
    token plan        { <'1..'> \d+ \n }
    token lines       { <line>+ }
    token line        { <test> | <comment> }
    token test        {
        <status>
        [ <' '> (<[1..9]> \d*) ]?   ## assumes a single space, not \h+
        \h* <description>?
        \h* <directive>?
        \n
    }
    token status      { <'not '>? <'ok'> }
    token description { <after \h> <-[#\n]>+ }
    token directive   { <after \h> <'# '> [:i todo | skip ] \N* }
    token comment     { <'#'> \N* \n }

notes:
~ there are some places i took liberties, eg. \d versus [:digit:] and \N versus [:print:]. (are comments, directives, and descriptions really limited to printable characters?) this leaves fewer named rules, which is a significant optimization. if it is incorrect, it can be modified to match the semantics of ovid's grammar more closely.
~ removed unnecessary indirections, like the 'tests' rule
~ reduced backtracking
~ ovid defines lines/line, but never uses them. this is fixed
~ not yet running on parrot against the straps tests, so some assumptions about whitespace may be incorrect
~ assumed ordering of comments/tests is significant, otherwise 'lines' can be replaced with the body of 'line'

overall, i think it looks really clean in perl6, and it will parse really nicely. $<tap><lines> will be an array of all the lines, and each element of $<tap><lines> will have either a <test> or a <comment> key.

as an aside, patrick asked why C<1..0> was valid. i explained C<no_plan>, and he asked why it wasn't C<1..> or the more perl6ish C<1..*>. i know TAP isn't perl-only, but i have to say, C<1..*> is really growing on me... 'one to whatever.' it looks and sounds better than 'one to zero.' anyone else have thoughts on that?

~jerry
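
p.s. since the grammar isn't running on parrot yet, here is a rough python sketch of the same tokens, for anyone who wants to poke at the shape of the result. it is not the perl6 grammar itself: the names (parse_tap, parse_lines) are made up for illustration, \h is approximated by [ \t] and \N by [^\n], python regexes backtrack where tokens would not, and requiring the whole input to be consumed is my own assumption.

```python
import re

# Rough regex stand-ins for the tokens in the grammar above (assumptions:
# \h ~ [ \t], \N ~ [^\n]; Python backtracks, Perl 6 tokens do not).
PLAN = re.compile(r"1\.\.(\d+)\n")
COMMENT = re.compile(r"#[^\n]*\n")
TEST = re.compile(
    r"(?P<status>(?:not )?ok)"
    r"(?: (?P<number>[1-9]\d*))?"                       # single space, not \h+
    r"[ \t]*(?P<description>(?<=[ \t])[^#\n]+)?"
    r"[ \t]*(?P<directive>(?<=[ \t])# (?i:todo|skip)[^\n]*)?"
    r"\n"
)


def parse_lines(text, pos):
    """<line>+ : collect tests and comments in order, starting at pos."""
    lines = []
    while True:
        m = TEST.match(text, pos)
        if m:
            number = m.group("number")
            lines.append({"test": {
                "status": m.group("status"),
                "number": int(number) if number else None,
                "description": m.group("description"),
                "directive": m.group("directive"),
            }})
        else:
            m = COMMENT.match(text, pos)
            if not m:
                break
            lines.append({"comment": m.group(0)})
        pos = m.end()
    return lines, pos


def parse_tap(text):
    """<plan> <lines> | <lines> <plan> <comment>* ; None if no match."""
    m = PLAN.match(text)
    if m:
        # leading plan, then at least one line
        lines, pos = parse_lines(text, m.end())
        if lines and pos == len(text):
            return {"plan": int(m.group(1)), "lines": lines}
        return None
    # trailing plan, optionally followed by comments
    lines, pos = parse_lines(text, 0)
    m = PLAN.match(text, pos)
    if not lines or not m:
        return None
    pos = m.end()
    while True:
        c = COMMENT.match(text, pos)
        if not c:
            break
        pos = c.end()
    if pos != len(text):
        return None
    return {"plan": int(m.group(1)), "lines": lines}
```

each element of the returned "lines" list has either a "test" or a "comment" key, which is the same shape described above for $<tap><lines>.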