subject:"Unwanted failure and FAILGOAL"

Re: Unwanted failure and FAILGOAL

2016-05-11 Thread Moritz Lenz


Hi,

On 05/11/2016 07:45 AM, Richard Hainsworth wrote:

> I have the following in a grammar
>
>  rule TOP { ^ <statement>+ $ };
>
>  rule statement { <id> '=' <endvalue>
>   | { { self.panic($/, "Declaration syntax incorrect") } }
>  };
>
>  rule endvalue { <keyword> '(' ~ ')' <pairlist>
>   | { self.panic($/, "Invalid declaration.") }
>  }
>
> The grammar parses a correct input file up until the end of the file. At
> that point even if there is no un-consumed input, there is an attempt to
> match <statement>, which fails. The failure causes the panic with
> 'Declaration syntax'.
>
> Am I missing something simple here?
>
> I would have thought (though this is only a very newbie assumption)
> that if the end of the input being sent to the grammar has been reached
> after the last <statement> has been matched, then there should be no
> reason for the parse method to try to match <statement> again and, if it
> fails, to test for the end of input.


This is not how regexes or grammars work.

The + quantifier tries as many times as possible to match the regex. It 
doesn't look ahead to see if more characters are available, and it 
doesn't know about the end-of-string anchor that comes next in the grammar.


In fact, it doesn't even know whether the rule it quantifies might have a
way to match zero characters. If it did, it would be wrong not to attempt
a zero-width match at the end of the string.
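
To make the extra attempt visible, here is a minimal sketch. The grammar name Demo, the tokens id and value, and the $attempts counter are all made up for this illustration; they are not taken from the original grammar.

```raku
# Sketch only: Demo, id, value and $attempts are hypothetical names.
grammar Demo {
    my $attempts = 0;                 # incremented every time <statement> is tried
    method attempts { $attempts }

    rule  TOP       { ^ <statement>+ $ }
    rule  statement { { $attempts++ } <id> '=' <value> }
    token id        { <[a..z]>+ }
    token value     { \d+ }
}

my $m = Demo.parse("a = 1\nb = 2\n");
say $m<statement>.elems;   # how many statements matched
say Demo.attempts;         # how many times <statement> was tried
```

The second number should be one higher than the first: the quantifier only stops once an attempt at <statement> has failed, and that last attempt happens at the end of the input.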


As for improving the error reporting from within a grammar, there are 
lots of ways to get creative, and I'd urge you to read Perl 6's own 
grammar, which is a good inspiration for that.

See https://github.com/rakudo/rakudo/blob/nom/src/Perl6/Grammar.nqp

One thing you could do is structure the statement rule differently:

rule statement {
   [ <id> '=' <endvalue>
   || { self.panic($/, "Invalid declaration.") }
   ]
}

And maybe also TOP:

rule TOP { ^ [ <statement> || . { self.panic($/, "Expected a statement") } ]+ $ };


That extra dot before the panic ensures it's not called at the end of 
the string. If you don't want that, you could also do


[ <statement> || $ || { self.panic(...) } ]
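
As a rough, self-contained sketch of the first TOP variant: the grammar name Decl, the tokens id and value, and the panic method are assumptions added so the example runs on its own. A plain grammar has no panic method by default, so one has to be supplied; the + after the group is also an assumption, so that more than one statement can match.

```raku
# Sketch only: Decl, id, value and panic are hypothetical stand-ins.
grammar Decl {
    rule  TOP       { ^ [ <statement> || . { self.panic($/, "Expected a statement") } ]+ $ }
    rule  statement { <id> '=' <value> }
    token id        { <[a..z]>+ }
    token value     { \d+ }

    # Assumed helper: simply raises an exception with the given message.
    method panic($match, $msg) { die "$msg (near position {$match.pos})" }
}

# Valid input: at the end of the string the dot cannot match, so no panic.
say so Decl.parse("a = 1\nb = 2\n");

# Invalid input: <statement> fails, the dot matches, and the panic fires.
try Decl.parse("a = 1\n?? oops\n");
say $!.message if $!;
```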

Cheers,
Moritz

Re: Unwanted failure and FAILGOAL

2016-05-11 Thread Damian Conway

Hi Richard,

Not a complete answer to your question;
just an observation about your grammar:

>  rule TOP { ^ <statement>+ $ };
>
>  rule statement { <id> '=' <endvalue>
>  | { { self.panic($/, "Declaration syntax incorrect") } }
>  };
>
>  rule endvalue { <keyword> '(' ~ ')' <pairlist>
>  | { self.panic($/, "Invalid declaration.") }
>  }

That's more or less the equivalent of:

sub TOP {
    die if !at_start_of_input();
    loop { last unless try statement() };
    die if !at_end_of_input();
}

sub statement {
    try { id(); match_literal('='); endvalue() }
        or
    die "Declaration syntax incorrect";
}

sub endvalue {
    try { keyword(); match_literal('('); pairlist(); match_literal(')') }
        or
    die "Invalid declaration."
}

In which case, would you really have expected a call to TOP()
NOT to throw an exception from statement(), the first time statement()
couldn't match (as it inevitably won't if we're at the end of the input)???

If these were subroutines, I suspect you'd have written something
more like:

sub statement {
    try { id(); match_literal('='); endvalue() }
        or
    !at_end_of_input() && die "Declaration syntax incorrect"
}

sub endvalue {
    try { keyword(); match_literal('('); pairlist(); match_literal(')') }
        or
    !at_end_of_input() && die "Invalid declaration.";
}

which, in a regex, would be something like:

   rule TOP { ^ <statement>+ $ };

   rule statement {
       <id> '=' <endvalue>
     |
       \S  # ...means we found something else, so...
       { self.panic($/, "Declaration syntax incorrect") }
   };

   rule endvalue {
       <keyword> '(' ~ ')' <pairlist>
     |
       \S  # ...means we found something else, so...
       { self.panic($/, "Invalid declaration.") }
   }

Though, personally, I'd have been inclined to write it like this:

rule TOP        { ^ <statement>+ [ $ | <unexpected> ] }

rule statement  { 'ID' '=' <endvalue> }

rule endvalue   { 'keyword' '(' ~ ')' 'pairlist' }

rule unexpected { $ = (\N+)  {
    self.panic($/,"Expected statement but found '$'")
}}

In other words: after the statements, we're either at the end of the input,
or else we found something unexpected, so capture it and then report it.
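
A self-contained sketch of that last layout might look like the following. Several things here are assumptions added for illustration: the grammar name Decl2, the capture name $<found>, the stand-in tokens id, keyword and pairlist, and the panic method itself (a plain grammar does not provide one).

```raku
# Sketch only: Decl2, $<found>, id, keyword, pairlist and panic are hypothetical.
grammar Decl2 {
    rule  TOP        { ^ <statement>+ [ $ | <unexpected> ] }
    rule  statement  { <id> '=' <endvalue> }
    rule  endvalue   { <keyword> '(' ~ ')' <pairlist> }
    rule  unexpected { $<found>=(\N+) { self.panic($/, "Expected statement but found '$<found>'") } }

    token id         { <[a..z]>+ }
    token keyword    { <[a..z]>+ }
    token pairlist   { <-[)]>* }      # crude stand-in: anything up to the closing paren

    # Assumed helper: raises an exception with the given message.
    method panic($match, $msg) { die $msg }
}

# Valid input reaches the end of the string, so <unexpected> is never tried.
say so Decl2.parse("x = list(a, b)\ny = set(c)\n");

# Leftover text after the statements is captured and reported.
try Decl2.parse("x = list(a, b)\n!!! oops\n");
say $!.message if $!;
```

The statement and endvalue rules no longer need to know anything about error reporting; only TOP decides what counts as unexpected input.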

HTH,

Damian

Unwanted failure and FAILGOAL

2016-05-10 Thread Richard Hainsworth


I have the following in a grammar

rule TOP { ^ <statement>+ $ };

rule statement { <id> '=' <endvalue>
 | { { self.panic($/, "Declaration syntax incorrect") } }
};

rule endvalue { <keyword> '(' ~ ')' <pairlist>
 | { self.panic($/, "Invalid declaration.") }
}

The grammar parses a correct input file up until the end of the file. At 
that point, even if there is no unconsumed input, there is an attempt to 
match <statement>, which fails. The failure causes the panic with 
'Declaration syntax'.


Am I missing something simple here?

I would have thought (though this is only a very newbie assumption) 
that if the end of the input being sent to the grammar has been reached 
after the last <statement> has been matched, then there should be no 
reason for the parse method to try to match <statement> again and, if it 
fails, to test for the end of input.


Abstractly, it seems to me to be a bit like the difference between 
testing for the truth of a condition before entering a loop, and testing 
for the truth after the loop.


In trying to find a way out of this, I went looking for some information 
about FAILGOAL. I could not find anything in the documentation or in the 
specifications.


Google provided me with some conversations about FAILGOAL, but nothing 
about how to use it. I do not know enough about the guts of Rakudo to 
know where to look.


Would it be possible for someone who knows about this to add something 
to the Documentation on Grammars?


(As I write this, I thought maybe I need a lookahead pattern in the 
TOP rule to ensure there is still input. But even so, that seems 
counterintuitive.)
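
On the FAILGOAL question above: as far as I understand it, FAILGOAL is the method that the ~ construct calls when the closing delimiter cannot be found, and a grammar can override it to produce its own message. A minimal sketch, where the grammar name Bracketed and the wording of the message are made up for this example:

```raku
# Sketch only: Bracketed and the error text are hypothetical.
grammar Bracketed {
    token TOP { '[' ~ ']' \w+ }

    # Called when the goal (here the closing bracket) is not found.
    method FAILGOAL($goal) {
        die "Cannot find $goal near position {self.pos}";
    }
}

say so Bracketed.parse('[good]');   # closing bracket present, FAILGOAL not called

try Bracketed.parse('[bad');        # closing bracket missing
say $!.message if $!;
```

Note that FAILGOAL only concerns the closing half of a ~ pair; it does not fire when <statement> fails at the end of the input, so it would not help with the panic described above.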
