Re: Unwanted failure and FAILGOAL
Hi,

On 05/11/2016 07:45 AM, Richard Hainsworth wrote:
> I have the following in a grammar
>
>     rule TOP { ^ <statement>+ $ };
>
>     rule statement { <id> '=' <endvalue>
>                    | { { self.panic($/, "Declaration syntax incorrect") } }
>                    };
>
>     rule endvalue { <keyword> '(' ~ ')' <pairlist>
>                   | { self.panic($/, "Invalid declaration.") }
>                   }
>
> The grammar parses a correct input file up until the end of the file. At that point even if there is no un-consumed input, there is an attempt to match <statement>, which fails. The failure causes the panic with 'Declaration syntax'.
>
> Am I missing something simple here?
>
> I would have thought (though this is only a very newbie assumption) that if the end of the input being sent to the grammar has been reached after the last <statement> has been matched, then there should be no reason for the parse method to try to match <statement> again, and if it fails to test for the end of input.

This is not how regexes or grammars work.

The + quantifier tries as many times as possible to match the regex. It doesn't look ahead to see if more characters are available, and it doesn't know about the end-of-string anchor that comes next in the grammar.

In fact, it doesn't know whether the rule it quantifies might have a way to match zero characters. In that case, it would be wrong behavior not to attempt a zero-width match at the end of the string.

As for improving the error reporting from within a grammar, there are lots of ways to get creative, and I'd urge you to read Perl 6's own grammar, which is a good inspiration for that. See https://github.com/rakudo/rakudo/blob/nom/src/Perl6/Grammar.nqp

One thing you could do is structure the statement rule differently:

    rule statement {
        [ <id> '=' <endvalue>
        || { self.panic($/, "Invalid declaration.") }
        ]
    }

And maybe also TOP:

    rule TOP { ^ [ <statement> || . { self.panic($/, "Expected a statement") } ] $ };

That extra dot before the panic ensures it's not called at the end of the string. If you don't want that, you could also do

    [ <statement> || $ || { self.panic(...) } ]

Cheers,
Moritz
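The point about the + quantifier can be made visible with a small counter. This is only a sketch: the grammar name `Count`, the token `item`, and the variable `$attempts` are made up for illustration and are not part of Richard's grammar.

```raku
# Counts how often the quantified token is entered.
my $attempts = 0;

grammar Count {
    token TOP  { ^ <item>+ $ }
    # The code block runs every time <item> is tried,
    # whether or not the rest of the token then matches.
    token item { { $attempts++ } \w \s* }
}

my $match = Count.parse("a b c");
say $match.so;     # did the parse as a whole succeed?
say $attempts;     # number of times <item> was attempted
```

With three items in the input, the counter should come out one higher than the number of items matched: the extra attempt is the one made at the end of the string, which fails and lets the + quantifier stop. If that failing attempt ran a panic instead of simply failing, the parse would die at exactly that point.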
Re: Unwanted failure and FAILGOAL
Hi Richard,

Not a complete answer to your question; just an observation about your grammar:

>     rule TOP { ^ <statement>+ $ };
>
>     rule statement { <id> '=' <endvalue>
>                    | { { self.panic($/, "Declaration syntax incorrect") } }
>                    };
>
>     rule endvalue { <keyword> '(' ~ ')' <pairlist>
>                   | { self.panic($/, "Invalid declaration.") }
>                   }

That's more or less the equivalent of:

    sub TOP {
        die if !at_start_of_input();
        loop { last unless try statement() };
        die if !at_end_of_input();
    }

    sub statement {
        try { id(); match_literal('='); endvalue() }
            or die "Declaration syntax incorrect";
    }

    sub endvalue {
        try { keyword(); match_literal('('); pairlist(); match_literal(')') }
            or die "Invalid declaration."
    }

In which case, would you really have expected a call to TOP() NOT to throw an exception from statement(), the first time statement() couldn't match (as it inevitably won't if we're at the end of the input)?

If these were subroutines, I suspect you'd have written something more like:

    sub statement {
        try { id(); match_literal('='); endvalue() }
            or !at_end_of_input() && die "Declaration syntax incorrect"
    }

    sub endvalue {
        try { keyword(); match_literal('('); pairlist(); match_literal(')') }
            or !at_end_of_input() && die "Invalid declaration.";
    }

which, in a regex, would be something like:

    rule TOP { ^ <statement>+ $ };

    rule statement { <id> '=' <endvalue>
                   | \S  # ...means we found something else, so...
                     { self.panic($/, "Declaration syntax incorrect") }
                   };

    rule endvalue { <keyword> '(' ~ ')' <pairlist>
                  | \S  # ...means we found something else, so...
                    { self.panic($/, "Invalid declaration.") }
                  }

Though, personally, I'd have been inclined to write it like this:

    rule TOP { ^ <statement>+ [ $ | <unexpected> ] }

    rule statement { 'ID' '=' <endvalue> }

    rule endvalue { 'keyword' '(' ~ ')' 'pairlist' }

    rule unexpected { $ = (\N+) { self.panic($/,"Expected statement but found '$'") }}

In other words: after the statements, we're either at the end of the input, or else we found something unexpected, so capture it and then report it.

HTH,
Damian
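For anyone who wants to try the "capture the unexpected input and report it" structure, here is a self-contained sketch. Several things in it are assumptions rather than anything from the thread: the grammar name `Decl`, the definitions of `id`, `keyword` and `pairlist`, the capture name `found`, and the use of a plain `die` where the original calls `self.panic` (whose definition was never posted).

```raku
grammar Decl {
    # After the statements we are either at the end of the input,
    # or we found something unexpected.
    rule  TOP        { ^ <statement>+ [ $ || <unexpected> ] }
    rule  statement  { <id> '=' <endvalue> }
    rule  endvalue   { <keyword> '(' ~ ')' <pairlist> }

    # Hypothetical stand-ins for the parts not shown in the thread.
    token id         { <[a..z]>+ }
    token keyword    { <[a..z]>+ }
    token pairlist   { \d+ [ ',' \s* \d+ ]* }

    # Capture the rest of the offending line, then report it.
    token unexpected {
        $<found>=(\N+)
        { die "Expected statement but found '" ~ $<found> ~ "'" }
    }
}

# Well-formed input: no panic at the end of the string.
my $good = Decl.parse("x = list(1, 2)\ny = list(3)\n");
say $good.so;

# Malformed input: the exception carries the offending text.
my $bad = try Decl.parse("x = list(1, 2)\n???\n");
say $!.message if $!;
```

The design choice is that `statement` and `endvalue` simply fail when they cannot match, so the extra attempt that + makes at the end of the input is harmless; error reporting happens in one place, and only when there really is leftover input.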
Unwanted failure and FAILGOAL
I have the following in a grammar

    rule TOP { ^ <statement>+ $ };

    rule statement { <id> '=' <endvalue>
                   | { { self.panic($/, "Declaration syntax incorrect") } }
                   };

    rule endvalue { <keyword> '(' ~ ')' <pairlist>
                  | { self.panic($/, "Invalid declaration.") }
                  }

The grammar parses a correct input file up until the end of the file. At that point even if there is no un-consumed input, there is an attempt to match <statement>, which fails. The failure causes the panic with 'Declaration syntax'.

Am I missing something simple here?

I would have thought (though this is only a very newbie assumption) that if the end of the input being sent to the grammar has been reached after the last <statement> has been matched, then there should be no reason for the parse method to try to match <statement> again, and if it fails to test for the end of input.

Abstractly, it seems to me to be a bit like the difference between testing for the truth of a condition before entering a loop, and testing for the truth after the loop.

In trying to find a way out of this, I went looking for some information about FAILGOAL. I could not find anything in the documentation or in the specifications. Google provided me with some conversations about FAILGOAL, but nothing about how to use it. I do not know enough about the guts of Rakudo to know where to look.

Would it be possible for someone who knows about this to add something to the Documentation on Grammars?

(As I write this, I thought maybe I need a lookahead pattern in the TOP rule to ensure there is still input. But even so, that seems counterintuitive.)