option: /<option/i /\w+/ optarg(?) '>'
Hi,

why doesn't the question mark in the optarg(?) below work as I expect? In the trace I see that optarg consumes the '>' character and then the option rule fails, since the final '>' is missing. Isn't P::RD supposed to backtrack and say "OK, the optarg(?) didn't match anything here"?

Of course the script below works when I change optarg from /\S+/ to /\w+/, but I'm curious: why doesn't optarg(?) mean ZERO or one here?

Thank you
Alex

    #!/usr/bin/perl -w

    use strict;
    use vars qw($parser $text %option);
    use Data::Dumper;
    use Parse::RecDescent;

    # P::RD diagnostics live in package main, so qualify them under strict
    $::RD_WARN  = 1;
    $::RD_HINT  = 1;
    $::RD_TRACE = 120;

    $parser = Parse::RecDescent->new(q(

        genfile: chunk(s) /^\Z/

        chunk: option | <error>

        option: /<option/i /\w+/ optarg(?) '>'
            { push @{$::option{lc $item[2]}}, $item[-2]; }

        optarg: /\S+/

    )) or die 'Bad grammar';

    $text .= $_ while (<DATA>);
    defined $parser->genfile($text) or die 'Bad text';
    print STDERR Data::Dumper->Dump([\%option], [qw(option)]);

    __DATA__
    <option one>
    <option two>
grammar problems
Hello,

I recently downloaded the Parse::RecDescent package to parse boolean queries. I have several questions, not necessarily about the package itself, but rather about my grammar. I am currently using the following grammar:

    my $grammar = q {
        <autotree>
        disj    : qualif(?) conj disjRec(s?)
        disjRec : disjOp conj
        conj    : term conjRec(s?)
        conjRec : conjOp term
        term    : brack | phrase | ident
        brack   : '(' disj ')'
        phrase  : '' ident(s?) ''
        ident   : /[a-zA-Z0-9]+/i
        qualif  : ident '='
        conjOp  : /AND/i
        disjOp  : /OR/i
    };

I have several problems with this.

Firstly, the precedence isn't quite what I want in 'qualif'. I would like 'ident' to bind tightly to whatever comes next to it. Currently it seems to associate with the whole 'disj' that comes after.

Secondly, I would like the 'conjOp' operator to be optional, and for the parser to recognise this. This means a query for 'a b' would be interpreted as 'a AND b'. I tried replacing the conjOp rule with 'conjOp : /AND/i | ', but this does not work, as a query 'a AND b' is now interpreted as 'a AND and ...'.

Any help would be appreciated.

Thanks,
Jonas.
Re: option: /<option/i /\w+/ optarg(?) '>'
On Thu, 20 May 2004, [EMAIL PROTECTED] wrote:

> why doesn't the question mark in the optarg(?) below work as I expect?
> In the trace I see that optarg consumes the '>' character and then the
> option rule fails, since the final '>' is missing. Isn't P::RD supposed
> to backtrack and say "OK, the optarg(?) didn't match anything here"?
>
> Of course the script below works when I change optarg from /\S+/ to
> /\w+/, but I'm curious: why doesn't optarg(?) mean ZERO or one here?

Your basic problem is this:

    optarg: /\S+/

Because P::RD relies on the Perl 5 regex engine, it can't backtrack regular expressions on its own. It just asks Perl 5 to match an optarg, and whatever that match consumed is used from then on. Maybe the Perl 6 grammars will do this (I think they can), but for now you have to be careful with the regular expressions you specify.

This example, in particular, should exclude '>' from the possible things optarg will match. So /\w+/ is a solution, but it makes the optarg rule unnecessarily strict. You may like /[^>]+/ better. But then, depending on how SGML-like your language is, you may have embedded '>' characters in your option arguments. In your case, it seems like what I suggest would be OK.

Ted
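To make Ted's point concrete, here is a small core-Perl sketch (no Parse::RecDescent involved) of what each candidate optarg pattern would consume. It assumes the input looked like `<option one>`, so that only `>` (or ` arg>` when an argument is present) is left when optarg is tried. The helper name `consume` is made up for this illustration, and skipping leading whitespace only imitates P::RD's default skip pattern.

```perl
use strict;
use warnings;

# Hypothetical helper: return what $pattern consumes at the start of
# $input after skipping leading whitespace, or undef if it cannot match.
sub consume {
    my ($pattern, $input) = @_;
    return $input =~ /\A\s*($pattern)/ ? $1 : undef;
}

my @candidates = (
    [ '/\S+/',     qr/\S+/     ],   # the original optarg
    [ '/\w+/',     qr/\w+/     ],   # works, but stricter than needed
    [ '/[^\s>]+/', qr/[^\s>]+/ ],   # excludes only whitespace and '>'
);

for my $rest ('>', ' arg>') {
    for my $candidate (@candidates) {
        my ($label, $pattern) = @$candidate;
        my $got = consume($pattern, $rest);
        printf "%-10s on %-8s consumes %s\n",
            $label, "'$rest'", defined $got ? "'$got'" : 'nothing';
    }
}
```

Only /\S+/ swallows the closing `>`; the other two patterns leave it in place for the literal '>' that follows optarg(?) in the option rule.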
Re: option: /<option/i /\w+/ optarg(?) '>'
On Thursday, May 20, 2004 Alexander Farber said:

> Hi,
>
> why doesn't the question mark in the optarg(?) below work as I expect?
> In the trace I see that optarg consumes the '>' character and then the
> option rule fails, since the final '>' is missing. Isn't P::RD supposed
> to backtrack and say "OK, the optarg(?) didn't match anything here"?

Um, why would it do this? In fact, optarg *does* match zero or one times; it is option that does not match. This is a simple problem with your grammar where you got (exactly...) what you asked for. As an example, if you change optarg it will be better:

    optarg: /[^\s>]+/

Also, there is more than one way to do it, but I don't like the first item of your option rule, /<option/, and would prefer '<' 'option' as it is easier to maintain. If the problem is that there should not be a space between '<' and 'option', then change skip. But this is not related to your question, this is just me butting into your style... ;-)

> Of course the script below works when I change optarg from /\S+/ to
> /\w+/, but I'm curious: why doesn't optarg(?) mean ZERO or one here?
>
> Thank you
> Alex
>
>     #!/usr/bin/perl -w
>
>     use strict;
>     use vars qw($parser $text %option);
>     use Data::Dumper;
>     use Parse::RecDescent;
>
>     $::RD_WARN  = 1;
>     $::RD_HINT  = 1;
>     $::RD_TRACE = 120;
>
>     $parser = Parse::RecDescent->new(q(
>
>         genfile: chunk(s) /^\Z/
>
>         chunk: option | <error>
>
>         option: /<option/i /\w+/ optarg(?) '>'
>             { push @{$::option{lc $item[2]}}, $item[-2]; }
>
>         optarg: /\S+/
>
>     )) or die 'Bad grammar';
>
>     $text .= $_ while (<DATA>);
>     defined $parser->genfile($text) or die 'Bad text';
>     print STDERR Data::Dumper->Dump([\%option], [qw(option)]);
>
>     __DATA__
>     <option one>
>     <option two>

--
Intel, Corp.
5000 W. Chandler Blvd.
Chandler, AZ 85226
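The remark that optarg matches while option fails can be modelled in a few lines of core Perl. This is a deliberately simplified sketch, not Parse::RecDescent's actual implementation. It assumes each item of a production is tried once, left to right, that an optional item which fails consumes nothing, and that earlier items are never retried with a shorter match. The names `match_sequence` and `option_rule` are invented for the illustration.

```perl
use strict;
use warnings;

# Simplified model (an assumption, not P::RD's real code): each item is
# [pattern, optional_flag]. Items are matched once, in order, against
# the front of $text; nothing already consumed is ever given back.
sub match_sequence {
    my ($text, @items) = @_;
    for my $item (@items) {
        my ($pattern, $optional) = @$item;
        next if $text =~ s/\A\s*(?:$pattern)//;   # item matched and consumed
        return 0 unless $optional;                # a required item failed
    }
    return 1;
}

# The option rule from the thread, parameterised by the optarg pattern.
sub option_rule {
    my ($optarg) = @_;
    return ([qr/<option/i], [qr/\w+/], [$optarg, 1], [qr/>/]);
}

for my $input ('<option one>', '<option two extra>') {
    for my $case (['\S+', qr/\S+/], ['[^\s>]+', qr/[^\s>]+/]) {
        my ($label, $optarg) = @$case;
        my $ok = match_sequence($input, option_rule($optarg));
        printf "%-20s optarg /%s/: %s\n",
            $input, $label, $ok ? 'option matches' : 'option fails';
    }
}
```

In this model /\S+/ makes the rule fail for both inputs, because the optional item succeeds by eating the `>` that the final required item needs. With /[^\s>]+/ the optional item either matches just the argument or matches nothing, and the rule succeeds.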
Re: grammar problems
On Thursday, May 20, 2004 Jonas Wolf said:

> Hello,
>
> I recently downloaded the Parse::RecDescent package to parse boolean
> queries. I have several questions, not necessarily about the package
> itself, but rather about my grammar. I am currently using the following
> grammar:
>
>     my $grammar = q {
>         <autotree>
>         disj    : qualif(?) conj disjRec(s?)
>         disjRec : disjOp conj
>         conj    : term conjRec(s?)
>         conjRec : conjOp term
>         term    : brack | phrase | ident
>         brack   : '(' disj ')'
>         phrase  : '' ident(s?) ''
>         ident   : /[a-zA-Z0-9]+/i
>         qualif  : ident '='
>         conjOp  : /AND/i
>         disjOp  : /OR/i
>     };
>
> I have several problems with this.
>
> Firstly, the precedence isn't quite what I want in 'qualif'. I would
> like 'ident' to bind tightly to whatever comes next to it. Currently it
> seems to associate with the whole 'disj' that comes after.

You do not provide any examples of what you are trying to parse, which would help. IMHO it is not clear what you mean by the above paragraph, so I don't know what you want.

> Secondly, I would like the 'conjOp' operator to be optional, and for
> the parser to recognise this. This means a query for 'a b' would be
> interpreted as 'a AND b'. I tried replacing the conjOp rule with
> 'conjOp : /AND/i | ', but this does not work, as a query 'a AND b' is
> now interpreted as 'a AND and ...'.

Perhaps:

    conj : term conjOp(?) conj
         | term

It might help if you followed the example given in the P::RD POD more closely.

> Any help would be appreciated.
>
> Thanks,
> Jonas.

--
Intel, Corp.
5000 W. Chandler Blvd.
Chandler, AZ 85226