To me, this looks like more of a whitespace issue. The lexer simply matches
one token after another and never requires whitespace between them. If you
ask PLY to parse something like "foo(bar)", there is no requirement that
whitespace appear between "foo" and "(". Similarly, if you have tokens for
integers and identifiers, then something like "45foo" is going to be
tokenized as "45" (int) followed by "foo" (identifier). I'm not exactly sure
how you might want to fix it. Perhaps you can define a special illegal token
to handle that case:
t_ILLEGALID = t_INTEGER + t_IDENTIFIER
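
Here is a minimal sketch of that idea. It uses plain `re` instead of PLY so
that it runs standalone, and everything in it (the `tokenize` helper, the
rule lists, the simplified patterns) is hypothetical and only for
illustration. Two things to watch if you carry it over to the lexer quoted
below: 'ILLEGALID' would have to be added to the `tokens` tuple, and the
quoted t_IDENTIFIER allows whitespace as its first character, so a straight
concatenation would also swallow input like `3 "xyz"`. The sketch assumes a
tighter identifier pattern to avoid that.

```python
import re

# Hypothetical, simplified patterns (not the exact ones from the quoted lexer).
INTEGER = r'-?\d+'
IDENTIFIER = r'[A-Za-z_+*/<=>!?-][^()\s]*'   # first character may not be a digit or whitespace
ILLEGALID = INTEGER + IDENTIFIER             # an integer glued directly to an identifier

BASE_RULES = [
    ('INTEGER', INTEGER),
    ('IDENTIFIER', IDENTIFIER),
    ('LPAREN', r'\('),
    ('RPAREN', r'\)'),
]

# The catch-all rule has to be tried before INTEGER, otherwise "7" wins first.
STRICT_RULES = [('ILLEGALID', ILLEGALID)] + BASE_RULES


def tokenize(text, rules):
    """Return (type, value) pairs, trying the rules in order at each position."""
    master = re.compile('|'.join('(?P<%s>%s)' % (name, pat) for name, pat in rules))
    result = []
    pos = 0
    while pos < len(text):
        if text[pos] in ' \t\n':      # skip whitespace between tokens
            pos += 1
            continue
        m = master.match(text, pos)
        if m is None:
            raise SyntaxError('Illegal character %r at position %d' % (text[pos], pos))
        result.append((m.lastgroup, m.group()))
        pos = m.end()
    return result


# Without the extra rule, "7abc" silently splits into two tokens.
print(tokenize('(+ 7abc 3)', BASE_RULES))
# With it, "7abc" comes back as a single ILLEGALID token that can be reported.
print(tokenize('(+ 7abc 3)', STRICT_RULES))
```

In PLY itself, rules given as strings are sorted by decreasing regular
expression length, so the concatenated pattern should be tried ahead of
t_INTEGER and t_IDENTIFIER. Writing it as a function rule would also let you
report the error right there.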
Cheers,
Dave
On Nov 30, 2009, at 8:07 AM, Paul Miller wrote:
> I've written a rather minimal s-expression parser with PLY, but I'm
> experiencing a strange bug. Since the code is rather short, I'll post
> it here:
>
> --==BEGIN lexer.py==--
> import ply.lex as lex
>
> tokens = ('INTEGER', 'FLOAT', 'STRING', 'LPAREN', 'RPAREN',
>           'IDENTIFIER', 'NEWLINE', 'RATIONAL')
>
> t_FLOAT = r'((\d*\.\d+)(E[\+-]?\d+)?|([1-9]\d*E[\+-]?\d+))'
> t_STRING = r'\".*?\"'
> t_LPAREN = r'\('
> t_RPAREN = r'\)'
> t_IDENTIFIER = r'[^0-9()][^()\ \t\n]*'
> t_INTEGER = r'(-)?\d+'
> t_RATIONAL = r'(-)?\d+/\d+'
>
> t_ignore = ' \t'
>
> def t_NEWLINE(t):
>     r'\n'
>     t.lexer.lineno += 1
>
> def t_error(t):
>     '''
>     Houston, we have a problem.
>     '''
>     print("Illegal character %s" % t.value[0])
>     t.lexer.skip(1)
>
> lexer = lex.lex (optimize = 0)
>
> --==END lexer.py==--
>
> Now, when I do this:
>
> >>> from lexer import lexer
> >>>
> >>> lexer.input (' (+ 7abc 3 "xyz") ')
> >>> for token in lexer:
> ...     print token
>
> I get:
>
> LexToken(LPAREN,'(',1,1)
> LexToken(IDENTIFIER,'+',1,2)
> LexToken(INTEGER,'7',1,4)
> LexToken(IDENTIFIER,'abc',1,5)
> LexToken(INTEGER,'3',1,9)
> LexToken(IDENTIFIER,'"xyz"',1,11)
> LexToken(RPAREN,')',1,16)
> >>>
>
> What I'd expect is an error matching 7abc, since it's not a valid
> identifier. The thing that makes me suspect this is a PLY bug rather
> than a bug in my code is that pyscheme
> (http://hkn.eecs.berkeley.edu/~dyoo/python/pyscheme/) builds its lexer
> and parser using PLY and has the same bug. Can anyone confirm this is a
> bug in PLY or am I doing something subtly wrong?
>
> Thanks!
>
> --
>
> You received this message because you are subscribed to the Google Groups
> "ply-hack" group.
> To post to this group, send email to [email protected].
> To unsubscribe from this group, send email to
> [email protected].
> For more options, visit this group at
> http://groups.google.com/group/ply-hack?hl=en.
>
>