Re: Lex token problem

A.T.Hofkamp Mon, 05 Jan 2009 01:56:59 -0800

zt wrote:
> What would be the usual way to solve this kind of problem: the same
> character at different locations meaning a different TOKEN?


Other than switching to a much heavier parser framework (namely one that can 
handle this case by itself), you are stuck with the fact that LEX has no 
context for what it is scanning (you can 'cheat' by introducing lexer states, 
but that makes the design more complicated).

Without context, LEX cannot decide which token to return when patterns 
overlap.

As Dennis suggested already, the usual approach is to make the tokens of the 
lexer unique, and resolve the ambiguity in the parser.

You have several options; one of them is:

t_NUM = r"\d\d+|2|3|4|5|6|7|8|9"
t_VECTOR = r'H|L|X'
t_ZERO_ONE = r'0|1'
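
To see that these three patterns no longer overlap, here is a minimal sketch 
using only Python's standard re module rather than PLY itself. The names 
TOKEN_RE and tokenize, the SKIP group for whitespace, and the (kind, text) 
tuples are assumptions made for this illustration. NUM is listed first so 
that a multi-digit number such as "10" is not split into two ZERO_ONE tokens.

```python
import re

# The three token patterns from above, combined into one alternation.
# Alternatives are tried left to right, so NUM gets the first chance
# at multi-digit input.
TOKEN_RE = re.compile(r"""
    (?P<NUM>\d\d+|2|3|4|5|6|7|8|9)
  | (?P<VECTOR>H|L|X)
  | (?P<ZERO_ONE>0|1)
  | (?P<SKIP>\s+)
""", re.VERBOSE)


def tokenize(text):
    """Return a list of (kind, text) tuples; whitespace is dropped."""
    tokens = []
    pos = 0
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if m is None:
            raise ValueError(f"illegal character {text[pos]!r} at position {pos}")
        if m.lastgroup != "SKIP":
            tokens.append((m.lastgroup, m.group()))
        pos = m.end()
    return tokens
```

With this sketch, tokenize("12 0 H 1") gives NUM "12", ZERO_ONE "0", 
VECTOR "H", ZERO_ONE "1": a lone 0 or 1 always comes out as ZERO_ONE, and 
deciding whether it is a number or a vector element is left to the parser.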

In the parser, write

(NUM | ZERO_ONE) instead of NUM, and
(VECTOR | ZERO_ONE) instead of VECTOR.

You may want to move these to separate production rules.
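
As a rough picture of what such separate rules do, here is a hand-written 
sketch in plain Python rather than PLY grammar functions. The helper names 
number and vector_value, and the (kind, text) token tuples, are assumptions 
made for this illustration. Each helper accepts either the specific token or 
ZERO_ONE, so the rest of the grammar never sees the ambiguity.

```python
# Hand-written stand-ins for two grammar rules:
#   number       : NUM | ZERO_ONE
#   vector_value : VECTOR | ZERO_ONE
# Tokens are assumed to be (kind, text) tuples coming from the lexer.

def number(token):
    """Accept NUM or ZERO_ONE where the grammar expects a number."""
    kind, text = token
    if kind in ("NUM", "ZERO_ONE"):
        return int(text)
    raise SyntaxError(f"expected a number, got {kind} {text!r}")


def vector_value(token):
    """Accept VECTOR or ZERO_ONE where the grammar expects a vector element."""
    kind, text = token
    if kind in ("VECTOR", "ZERO_ONE"):
        return text
    raise SyntaxError(f"expected a vector element, got {kind} {text!r}")
```

The same ZERO_ONE token is read as the integer 1 by number and as the vector 
element "1" by vector_value; which helper is called depends only on where the 
parser is in the grammar.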


Sincerely,
Albert
