[fonc] Debugging PEGs and Packrats

2011-12-13 Thread Casey Ransberger
I know this has come up before. Hopefully I'm not about to repeat a lot. 

Debugging this stuff just seems really hard, and significantly harder than what 
I've experienced working with, e.g., Yacc. 

Hypothesis: Yacc had a lot of time to bake before I ever found it. PEGs are 
new, so there's been less overall experience with debugging them. 

I've experimented in what little time I can devote with OMeta, PetitParser, and 
Treetop. The debugging experience has been roughly consistent across all three. 

One particular issue which has bugged me: memoization seems to carry a lot of 
instance-state that's really hard to comprehend when the grammar isn't working 
as I expect. It's just really hard to use that ocean of information to figure 
out what I've done wrong. 

Given that with these new parsing technologies we're pretty lucky to see even 
"parse error" as an error message, I can't help but think that it's worth 
studying debugging strategies. Heh. :D I'm really not complaining, I'm just 
pointing it out. 

Has anyone here found any techniques which make debugging a grammar written 
for a PEG/packrat parser less of a pain in the butt?

I'd be really interested in hearing about it. 



___
fonc mailing list
fonc@vpri.org
http://vpri.org/mailman/listinfo/fonc


Re: [fonc] Debugging PEGs and Packrats

2011-12-13 Thread Wesley Smith
I use LPEG ( http://www.inf.puc-rio.br/~roberto/lpeg/ ) a lot for
writing grammars.  I'm not familiar with the ones you mention so I
have no idea how similar they are.  I too had a lot of trouble
debugging, so I ended up writing some tools that print out debugging
statements in a human readable form.  The ones I've found most useful
are listing out the matched tokens in order, the tokens attempted, and
a trace of the grammar rules the parser follows.  For reference, this
is a typical log of data I get (this one is for a Lua grammar):


1 1 - block
1  2 - chunk
1   3 - stat
1    4 - varlist
1     5 - var
1      6 - prefix
1     5 -: prefix
1    4 -: var
1   3 -: varlist
1    4 - functioncall
1     5 - prefix
1    4 -: prefix
1   3 -: functioncall
2    4 - funcname
2   3 - funcname 3 MATCH
3    4 - funcbody
4     5 - parlist
4      6 - namelist
8     5 - namelist 9 MATCH
8    4 - parlist 9 MATCH
10     5 - block
10      6 - chunk
10       7 - stat
10        8 - varlist
10         9 - var
10          10 - prefix
10         9 - prefix 10 MATCH
11          10 - suffix
11           11 - call
11            12 - args
11             13 - tableconstructor
11            12 -: tableconstructor
11           11 -: args
11          10 -: call
11           11 - index
11          10 -: index
11         9 -: suffix
11          10 - index
11         9 -: index
10        8 - var 11 MATCH
10       7 - varlist 11 MATCH
10        8 - functioncall
10         9 - prefix
10        8 - prefix 11 MATCH
11         9 - suffix
11          10 - call
11           11 - args
11            12 - tableconstructor
11           11 -: tableconstructor
11          10 -: args
11         9 -: call
11          10 - index
11         9 -: index
11        8 -: suffix
11         9 - call
11          10 - args
11           11 - tableconstructor
11          10 -: tableconstructor
11         9 -: args
11        8 -: call
10       7 -: functioncall
10      6 -: stat
10       7 - laststat
10      6 -: laststat
9     5 - chunk 11 MATCH
9    4 - block 11 MATCH
3   3 -: funcbody
1  2 -: stat
1   3 - laststat
1  2 -: laststat
0 1 - chunk 11 MATCH
0 0 - block 11 MATCH

Rule Stack:
{
  idx = 9,
  [1] = block,
  [2] = chunk,
  [3] = stat,
  [4] = funcbody,
  [5] = block,
  [6] = chunk,
  [7] = stat,
  [8] = functioncall,
  [9] = prefix,
}

Attempted Tokens List:
{
  rules = {
    [1] = args,
    [2] = tableconstructor,
    [3] = args,
    [4] = call,
    [5] = index,
    [6] = index,
    [7] = varlist,
    [8] = stat,
  },
  tokens = {
    [1] = LEFT_PAREN,
    [2] = LEFT_BRACE,
    [3] = STRING,
    [4] = COLON,
    [5] = LEFT_BRACKET,
    [6] = DOT,
    [7] = COMMA,
    [8] = EQUALS,
  },
}
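
For anyone who wants to try the same idea outside LPEG, here is a minimal
sketch in Python of the kind of tracing described above: each named rule logs
when it is entered, when it fails ("-:") and when it matches, and the tokens
that failed at the farthest input position are collected. Everything in it
(Tracer, rule, token, seq, choice, the toy "assign" grammar) is a hypothetical
name made up for illustration. It is not LPEG's API and not the tool that
produced the log above, and the line format only loosely imitates that log.
Keeping only the farthest failures is an assumption on my part about what
makes the attempted-token list useful.

```python
class Tracer:
    """Collects a rule trace plus the tokens attempted at the farthest failure."""

    def __init__(self):
        self.depth = 0        # current rule nesting depth
        self.lines = []       # human-readable trace, one line per event
        self.farthest = 0     # farthest input position at which a token failed
        self.attempted = []   # names of the tokens that failed at that position

    def note_failure(self, pos, token_name):
        # Keep only the failures at the farthest position reached so far.
        if pos > self.farthest:
            self.farthest, self.attempted = pos, []
        if pos == self.farthest:
            self.attempted.append(token_name)

    def rule(self, name, parser):
        """Wrap `parser` so entering, failing and matching are logged."""
        def traced(text, pos):
            self.depth += 1
            depth, pad = self.depth, " " * self.depth
            self.lines.append(f"{pos}{pad}{depth} - {name}")
            end = parser(text, pos)
            self.depth -= 1
            if end is None:
                self.lines.append(f"{pos}{pad}{depth} -: {name}")
            else:
                self.lines.append(f"{pos}{pad}{depth} - {name} {end} MATCH")
            return end
        return traced


# A parser here is a function (text, pos) -> new position, or None on failure.

def token(tracer, name, literal):
    def parse(text, pos):
        if text.startswith(literal, pos):
            return pos + len(literal)
        tracer.note_failure(pos, name)
        return None
    return parse


def seq(*parsers):
    def parse(text, pos):
        for p in parsers:
            pos = p(text, pos)
            if pos is None:
                return None
        return pos
    return parse


def choice(*parsers):
    # PEG ordered choice: the first alternative that matches wins.
    def parse(text, pos):
        for p in parsers:
            end = p(text, pos)
            if end is not None:
                return end
        return None
    return parse


# Toy grammar:  assign <- name "=" number ; name <- "x" / "y" ; number <- "1"
t = Tracer()
name = t.rule("name", choice(token(t, "X", "x"), token(t, "Y", "y")))
number = t.rule("number", token(t, "ONE", "1"))
assign = t.rule("assign", seq(name, token(t, "EQUALS", "="), number))

result = assign("y=2", 0)   # fails, because "2" is not a number in this grammar
print("\n".join(t.lines))
print("farthest failure at", t.farthest, "while trying", t.attempted)
```

The point of the farthest-failure bookkeeping is that the early failures an
ordered choice backtracks over drop out of the report, so what remains is
usually close to where the input actually went wrong.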
