>> I think the next step I would take (if that wasn't sufficient) is to
>> create a test suite that tests each grammar rule independently,
>> successively building up to the complex input that is failing.
>
> I'd been doing this for terminals, but it seems like non-terminals get
> harder to write as unit tests, parser state and all. I suppose integration
> tests could work, but it seems the more complex the grammatical structure,
> the more I experience diminishing returns with tests. I just end up with
> tests that fail mysteriously which is my original problem. Do you have any
> examples about? I wonder if maybe there's something essential I'm failing
> to understand. Maybe looking at your tests might send me along to an a-ha
> moment.

I don't have any tests to show you, but I will try to create an example of
what I mean. (I assume that the parser will fail if it cannot parse the
entire input.)

Given the following grammar rules:

    identifier     = letter (letter | digit)*
    method-call    = identifier (ws+ argument)+    (ws matches white space)
    argument       = number-literal | identifier
    number-literal = digit+

I would write tests something like this:

    test(number-literal, "1234")
    test(identifier, "foo")
    test(argument, "5")
    test(argument, "bar")
    test(method-call, "method a b c 3")

Obviously this is a made-up example, but I hope it shows you what I mean.

John

> Thanks again!
>
>> John
>>
>>
>> On Tue, 2011-12-13 at 23:17 -0800, Casey Ransberger wrote:
>>> I know this has come up before. Hopefully I'm not about to repeat a
>>> lot.
>>>
>>> Debugging this stuff just seems really hard. And significantly harder
>>> than what I've experienced working with e.g. Yacc.
>>>
>>> Hypothesis: Yacc had a lot of time to bake before I ever found it. PEGs
>>> are new, so there's been less overall experience with debugging them.
>>>
>>> I've experimented in what little time I can devote with OMeta,
>>> PetitParser, and Treetop. The debugging experience has been roughly
>>> consistent across all three.
>>>
>>> One particular issue which has bugged me: memoization seems to carry a
>>> lot of instance-state that's really hard to comprehend when the grammar
>>> isn't working as I expect. It's just really hard to use that ocean of
>>> information to figure out what I've done wrong.
>>>
>>> Given that with these new parsing technologies, we're pretty lucky to
>>> see "parse error" as an error message, I can't help but think that it's
>>> worth studying debugging strategies. Heh. :D I'm really not
>>> complaining, I'm just pointing it out.
>>>
>>> Has anyone here found any technique(s) which makes debugging a grammar
>>> written for a PEG/packrat less of a pain in the butt?
>>>
>>> I'd be really interested in hearing about it.
_______________________________________________
fonc mailing list
fonc@vpri.org
http://vpri.org/mailman/listinfo/fonc