Hello zt,
 
Both r'\d+' and r'0|1|...' match the digits 0 and 1. PLY sorts tokens that are 
defined as strings by decreasing regular expression length, so the longer 
r'0|1|...' pattern is tried first and wins (see the PLY documentation). Is 
there any way to differentiate the NUM and VECTOR tokens? For instance, can NUM 
tokens start with a 0 at all? Ideally you want two regular expressions that 
each match only the input meant for their own token (that is, no overlap). 
Overlap does work, as long as you know it is there and the pattern that gets 
priority is the one you want to have priority, but I still think it is better 
to avoid the overlap altogether.
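
To see the effect of that ordering without installing anything, here is a 
minimal sketch that only mimics the rule using the standard re module. It is 
not PLY itself: the helper names (build_master, tokenize, by_length, num_first) 
are made up for illustration, and skipping whitespace and ';' is an assumption 
about the input.

```python
import re

# Token patterns copied from the original post.
TOKEN_PATTERNS = {
    "TSET": r"TSET",
    "NUM": r"\d+",
    "MCODE": r"repeat",
    "VECTOR": r"0|1|H|L|X",
}


def build_master(names):
    """Join the named patterns into one alternation; earlier names win."""
    parts = ["(?P<%s>%s)" % (name, TOKEN_PATTERNS[name]) for name in names]
    return re.compile("|".join(parts))


def tokenize(text, names):
    """Return (token_type, value) pairs, skipping whitespace and ';'."""
    master = build_master(names)
    tokens = []
    pos = 0
    while pos < len(text):
        if text[pos] in " \t\n;":
            pos += 1
            continue
        match = master.match(text, pos)
        if match is None:
            raise ValueError("illegal character %r at %d" % (text[pos], pos))
        tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


# Mimic the rule for string-defined tokens: longest pattern first.
by_length = sorted(
    TOKEN_PATTERNS, key=lambda name: len(TOKEN_PATTERNS[name]), reverse=True
)

# Hypothetical alternative order that tries NUM before VECTOR.
num_first = ["TSET", "MCODE", "NUM", "VECTOR"]

print(tokenize("repeat 12", by_length))
print(tokenize("TSET 1 001 X 0 00;", num_first))
```

With the length-based order, VECTOR is tried before NUM, so the "1" in 
"repeat 12" comes out as VECTOR and only the "2" is left for NUM. Trying NUM 
first fixes that case but turns "001" and "00" into NUM as well, which is why 
changing the priority alone does not remove the ambiguity.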
 
Dennis

________________________________

From: [email protected] on behalf of zt
Sent: Wed 31-12-2008 9:49
To: ply-hack
Subject: Lex token problem




Hi all,

I am still learning how to write a parser with PLY. I need to parse
data in the following format:
 TSET 1        001 X 0 00;
                    001 X 0 00;
                    001 X 0 00;
 TSET 7        001 X 0 00;
repeat 12      001 X 0 00;

The tokens are defined as:
t_TSET=r'TSET'
t_NUM=r'\d+'
t_MCODE=r'repeat'
t_VECTOR=r'0|1|H|L|X'

but the lexer keeps treating the first "1" on line 1 as VECTOR instead of
NUM, and the "1" after "repeat" as VECTOR as well.
Is there a good way to fix this?

Thanks a lot!




