Hello zt,

Both r'\d+' and r'0|1|H|L|X' match the digits 0 and 1. PLY sorts tokens that are defined as strings by decreasing regular expression length, so the longer r'0|1|H|L|X' is tried first and wins (see the PLY documentation).

Is there any way to tell the NUM and VECTOR tokens apart? For instance, can a NUM token start with a 0 at all?

Ideally you want two regular expressions that each match only the input meant for their own token, that is, with no overlap. Overlap is workable as long as you know it is there and the rule that gets priority is the one you want to win, but I still think it is better to avoid the overlap altogether.

Dennis
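To make the priority rule concrete, here is a rough sketch that imitates it with the standard `re` module instead of PLY itself. The helpers `build_master` and `tokenize` and the `priority` parameter are made up for this illustration and are not part of PLY; they only mimic the documented ordering (rules listed in `priority` first, standing in for function-defined tokens, then the remaining string rules sorted by decreasing regex length).

```python
import re

# Token rules copied from the original post.
RULES = {
    "TSET": r"TSET",
    "NUM": r"\d+",
    "MCODE": r"repeat",
    "VECTOR": r"0|1|H|L|X",
}

def build_master(rules, priority=()):
    """Hypothetical helper: combine rules into one alternation.

    Names in `priority` go first, in the given order; the remaining
    rules are sorted by decreasing regex length.
    """
    first = [(name, rules[name]) for name in priority]
    rest = sorted(
        ((n, rx) for n, rx in rules.items() if n not in priority),
        key=lambda item: len(item[1]),
        reverse=True,
    )
    parts = [f"(?P<{name}>{rx})" for name, rx in first + rest]
    return re.compile("|".join(parts))

def tokenize(text, master):
    """Hypothetical helper: return (token_type, value) pairs, skipping whitespace."""
    pos, tokens = 0, []
    while pos < len(text):
        if text[pos].isspace():
            pos += 1
            continue
        match = master.match(text, pos)
        if match is None:
            raise ValueError(f"illegal character {text[pos]!r} at {pos}")
        tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens

# Length-sorted order: VECTOR is tried before NUM, as in the report.
by_length = build_master(RULES)
print(tokenize("TSET 1", by_length))
# [('TSET', 'TSET'), ('VECTOR', '1')]
print(tokenize("repeat 12", by_length))
# [('MCODE', 'repeat'), ('VECTOR', '1'), ('NUM', '2')]

# Giving NUM priority fixes "TSET 1" but swallows the vector digits.
num_first = build_master(RULES, priority=("NUM",))
print(tokenize("TSET 1", num_first))
# [('TSET', 'TSET'), ('NUM', '1')]
print(tokenize("001 X", num_first))
# [('NUM', '001'), ('VECTOR', 'X')]
```

The second pair of calls shows why simply flipping the priority is not enough here: as long as both rules can match a bare "0" or "1", one of the two token types will be misread somewhere in the input.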
________________________________
From: [email protected] on behalf of zt
Sent: Wed 31-12-2008 9:49
To: ply-hack
Subject: Lex token problem

Hi all,

I am still learning how to write a parser with PLY. I need to parse data in the following format:

TSET 1 001 X 0 00; 001 X 0 00; 001 X 0 00; TSET 7 001 X 0 00; repeat 12 001 X 0 00;

The tokens are defined as:

t_TSET=r'TSET'
t_NUM=r'\d+'
t_MCODE=r'repeat'
t_VECTOR=r'0|1|H|L|X'

But the lexer keeps treating the first "1" on line 1 as VECTOR instead of NUM, and it also treats the "1" after "repeat" as VECTOR. Is there a good way to fix this?

Thanks a lot!
