Interesting. The global and in-rule skip directives are handled a bit differently. The in-rule directives essentially go through an eval step, where the text after the ':' and before the terminating '>' is placed literally into the generated parser code as:
     $code .= '$skip =' . $1;
So whatever whitespace and quoting structures you used get translated directly into the global eval() of the generated parser.

The global skip directive feeds the extracted skip directive text into a variable that is then put into the parser code as something like:
    $code .= '$skip = \'$1\'';

Later the internal $skip variable is used in a regex something like:
    if ($text =~ s/\A($skip)//) { SkipMatched(); ...}

So in instances where the skip directive contents could be quoted via single quotes into something that would work as a variable interpolated into a regex, the global directive works. Otherwise, you get odd behavior, as you noticed.
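To make the difference concrete, here is a minimal sketch in plain Perl that imitates the two code paths described above. It does not use P::RD itself; `in_rule_skip`, `global_skip` and `skip_matches` are hypothetical helper names, and the generated assignments are only an approximation of what the real generator emits.

```perl
use strict;
use warnings;

# Hypothetical: in-rule path, directive text pasted literally after '$skip ='.
sub in_rule_skip {
    my ($text) = @_;
    my $skip;
    my $ok = eval '$skip =' . $text . '; 1';
    return $ok ? $skip : undef;    # undef means the generated code did not compile
}

# Hypothetical: global path, directive text wrapped in single quotes first.
sub global_skip {
    my ($text) = @_;
    my $skip;
    my $ok = eval '$skip = \'' . $text . '\'; 1';
    return $ok ? $skip : undef;
}

# Imitates the skip step: does $skip match at the very start of the input?
sub skip_matches {
    my ($skip, $input) = @_;
    return 'compile error' unless defined $skip;
    return $input =~ /\A($skip)/ ? 'skip matched' : 'skip did not match';
}

# The directive bodies from the test script, minus the '<skip:' and '>'.
for my $text ('\s*', ' \s*', 'qr/\s*/', q{'\s*'}) {
    printf "%-10s in-rule: %-20s global: %s\n", "[$text]",
        skip_matches(in_rule_skip($text), 'Please work'),
        skip_matches(global_skip($text),  'Please work');
}
```

In this sketch the quoted forms only compile on the in-rule path and the bare form only compiles on the global path, which is the asymmetry described above.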

The correct thing to do is for P::RD to treat the global skip the same as it does everything else, and require that you quote the contents of the skip directive, and eval the result in the generated parser. I've made these changes here:
    https://github.com/jtbraun/Parse-RecDescent/tree/global_skip

I'd appreciate it if you'd give them a try before I merge them back into the main line and push an update to PAUSE (a bug report would also be appreciated).

Additionally, one of your test cases will fail unexpectedly:

        try_grammar('bare skip, plus sign',undef,"<skip:\\s+>");

The generated parser always attempts to match $skip before a terminal, including the first one. There's no whitespace at the beginning of your string, which will lead to a failure to parse.
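A small standalone sketch of that behavior, assuming the skip step works roughly like the substitution shown earlier. `strip_skip` is a hypothetical stand-in, not a function in P::RD.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the skip step run before each terminal:
# remove a leading match of $skip, or return undef if $skip cannot match.
sub strip_skip {
    my ($skip, $text) = @_;
    return $text =~ s/\A($skip)// ? $text : undef;
}

for my $skip ('\s*', '\s+') {
    my $rest = strip_skip($skip, 'Please work');
    print "$skip: ", (defined $rest ? "skip ok, terminal sees '$rest'" : 'skip failed'), "\n";
}
```

With `\s*` the skip can match the empty string, so the first terminal is tried against the whole input. With `\s+` there must be at least one whitespace character before every terminal, including the first, so an input with no leading whitespace is rejected before `/\w+/` is ever tried.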

Thanks for the report,

Jeremy


On 03/31/2013 08:25 AM, yary wrote:
I've been trying to get a global <skip> directive- before all rule
definitions- to work, and it seems it must not be quoted or have any
spaces. It seems to run counter to the Parse::RecDescent
documentation. Following is code, I would expect each variant to
parse, but only the first 2 are OK, and the last crashes with
"Internal error in generated parser code!". Is that intended, or
should I file a bug report?

-y

#!/usr/bin/env perl
use warnings;
use strict;
use Parse::RecDescent;

$::RD_WARN=1;

my $plain_grammar='words : (/\w+/)(s) /\Z/';

sub try_grammar {
   my ($name, $trace, $definition)=@_;
   print $name,": $definition = ";
   print $::RD_HINT=$::RD_TRACE=$trace if defined $trace;
   my $grammar = Parse::RecDescent->new("$definition\n$plain_grammar");
   print defined $grammar->words("Please work") ? "OK" : "Didn't parse";
   print "\n";
   # undef takes a single argument, so reset each variable separately
   undef $::RD_HINT; undef $::RD_TRACE;
}

try_grammar('no skip directive',undef,'');
try_grammar('bare skip',undef,"<skip:\\s*>");
try_grammar('bare skip, with a preceding space',undef,"<skip: \\s*>");
try_grammar('bare skip, plus sign',undef,"<skip:\\s+>");
try_grammar('qr skip',undef,"<skip: qr/\\s*/>");
try_grammar('apostrophe skip',1,"<skip: '\\s*'>");
