> Below is my test app with broken grammar and test phrases -- am I
> missing something painfully obvious?

Well, I'm not sure how obvious it is, but it looks like you have your
lookaheads a little confused, and you seem to have overlooked the
implications of greedy matching.

  nekkid_phrase: word(s) ...!reserved_word

says "give me words not followed by a reserved word".  Since the word rule
matches reserved words as well, and it is used with the (s) modifier, it will
simply gobble them all up, find nothing left (thus satisfying the lookahead)
and succeed.

What you need to do is use the negative lookahead to ensure that word itself
doesn't match a reserved word:

  word: ...!reserved_word /yada-yada/
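
To see the difference outside the grammar, here is a rough hand-rolled
analogue in plain Perl regexes. It is only a sketch of the idea, not how
Parse::RecDescent works internally; the sub names greedy_words and
guarded_words and the sample string are made up for illustration.

```perl
use strict;
use warnings;

# Same reserved-word pattern as the grammar's reserved_word rule.
my $reserved = qr/AND\s+NOT|AND|NOT|OR|NEAR/;

# Analogue of "word(s) ...!reserved_word": collect every token first,
# and only then look at what follows.
sub greedy_words {
    my ($text) = @_;
    my @words;
    push @words, $1 while $text =~ /\G\s*([^\s{}()]+)/gc;
    # Any reserved word has already been swallowed as a token above,
    # so this check comes too late to stop anything.
    return () if $text =~ /\G\s*$reserved/;
    return @words;
}

# Analogue of "word: ...!reserved_word /.../": check before each token.
sub guarded_words {
    my ($text) = @_;
    my @words;
    push @words, $1 while $text =~ /\G\s*(?!$reserved)([^\s{}()]+)/gc;
    return @words;
}

my $sample = "foo bar AND baz";
print join(" ", greedy_words($sample)),  "\n";   # foo bar AND baz
print join(" ", guarded_words($sample)), "\n";   # foo bar
```

The first sub swallows the AND along with everything else; the second stops
in front of it, which is what the per-word lookahead buys you.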

That solves the first problem, but exposes a second.  The rule

  query:   phrase (reserved_word phrase)(?)

will only match input of one of the following two forms:

  phrase reserved_word phrase
  phrase

This means that if there is more than one reserved word, it will stop
matching after the phrase following the first reserved word (without
throwing an exception, as you didn't terminate the query with a /\z/). The
solution is to change the (?) to a (s?) so that it means "0 or more" instead
of "0 or 1".
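
The same effect can be shown with ordinary regex quantifiers, where ? plays
the part of (?) and * the part of (s?). This is just an analogy in plain
Perl; the lowercase-only $phrase pattern and the sample query are
assumptions chosen to keep the sketch short.

```perl
use strict;
use warnings;

my $op     = qr/AND\s+NOT|AND|NOT|OR|NEAR/;
my $phrase = qr/[a-z]+(?:\s+[a-z]+)*/;   # simplified: lowercase words only

my $zero_or_one  = qr/$phrase(?:\s+$op\s+$phrase)?/;   # like (?)
my $zero_or_more = qr/$phrase(?:\s+$op\s+$phrase)*/;   # like (s?)

# Return the part of $text consumed from the start, or undef on failure.
sub consumed {
    my ($re, $text) = @_;
    return $text =~ /^($re)/ ? $1 : undef;
}

my $query = "cats AND dogs OR birds";

print consumed($zero_or_one,  $query), "\n";   # cats AND dogs
print consumed($zero_or_more, $query), "\n";   # cats AND dogs OR birds

# Anchoring at the end of the string turns the silent partial match
# into an outright failure.
print "anchored (?) version fails\n" unless $query =~ /^$zero_or_one\z/;
```

Without the end anchor the "0 or 1" version quietly stops after "dogs"; with
it, the leftover "OR birds" makes the match fail, so the problem is visible.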

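So once you apply the above changes you get the following grammar, which
from my tests seems to correctly parse your test data. Note that query now
ends with an EOSTR rule (/\z/) and has an <error> alternative, so leftover
input is reported instead of being silently ignored. (I changed EOG to
EOGRAMMAR for reasons of paranoia.)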

my $grammar =<<'EOGRAMMAR';

query:   phrase (reserved_word phrase)(s?) EOSTR
       | <error>

phrase:   quoted_phrase
        | nekkid_phrase

quoted_phrase:   '"' /[^\"]+/ '"'
               | "'" /[^\']+/ "'"

nekkid_phrase: word(s)

reserved_word:  /AND\s+NOT|AND|NOT|OR|NEAR/

word: ...!reserved_word /[^\s\{\}\(\)]+/ { [ @item[0,2] ] }

EOSTR: /\z/

EOGRAMMAR


HTH

Yves
