Thanks for the link about inefficient regexp matching. I'll admit I didn't know much about what Python (in my case) was doing under the covers for its regexp matching. I found myself wondering why the code doesn't check whether backreferences are being used and choose its algorithm accordingly. It appears I did find one of those rare cases (though I'm still not quite certain why).
Brock

On Feb 20, 4:24 am, Franck Pommereau <[email protected]> wrote:
> > I've got a grammar (I'll add it below) for which lexing time seems to
> > grow exponentially in the length of a single string.
>
> Probably that's because PLY is based on the re module, and regexp
> matching is implemented in Python (as in many other languages) using
> backtracking. This leads to exponential run times in cases that are
> fortunately quite rare in practice. Maybe you've just found one of those
> cases.
>
> This issue of inefficient regexp matching is very well explained at:
> http://swtch.com/~rsc/regexp/regexp1.html
>
> This page also explains how an efficient implementation can be obtained
> for the subset of regexps that corresponds to rational languages.
>
> Maybe that would be enough to implement a lexer easily, but I'm not sure.
> I've implemented regexps with efficient matching in Python for teaching
> purposes, but when it came to implementing a lexer, I preferred a
> solution based on re, which requires only a few lines. (The code is at
> http://lacl.univ-paris12.fr/pommereau/tlf but the explanations are in
> French; the code uses English identifiers but is not commented.)
>
> Cheers,
> Franck
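To make the backtracking issue concrete, here is a small illustration (my own, not from the thread) of the classic pathological case: a nested quantifier like `(a+)+` forces Python's `re` engine to try every way of splitting the run of `a`s between the inner and outer repetitions before it can report failure, while an equivalent pattern without the nesting fails in linear time.

```python
import re
import time

# Pathological pattern: nested quantifiers make the backtracking engine
# explore roughly 2**(n-1) ways to partition n 'a's before failing.
pathological = re.compile(r'(a+)+$')

# Keep n small: the running time roughly doubles with each extra 'a'.
for n in (16, 18, 20):
    text = 'a' * n + 'b'
    start = time.perf_counter()
    assert pathological.match(text) is None
    print(n, round(time.perf_counter() - start, 4))

# An equivalent pattern for the same language (one or more 'a's) has no
# nested quantifier and fails in time linear in the input length.
linear = re.compile(r'a+$')
assert linear.match('a' * 100000 + 'b') is None
assert linear.match('a' * 100000) is not None
```

Both patterns match exactly the same strings; only the failure cost differs, which is why this kind of blowup can hide in a lexer until one long token triggers it.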
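The efficient technique Franck's link describes (for regexps without backreferences) can be sketched in a few lines: instead of backtracking down one path at a time, simulate the NFA by carrying the whole *set* of reachable states across each input character, giving O(text length × regexp size) matching. This is a minimal illustration of that state-set idea; the `NFA` class and the hand-built machine for `a(b|c)*d` are my own toy constructions (a regex-to-NFA parser is omitted for brevity).

```python
from collections import defaultdict

class NFA:
    """Toy NFA with epsilon moves. States are ints; 'transitions' maps
    (state, char) -> set of next states; 'epsilon' maps state -> set."""
    def __init__(self, start, accept):
        self.start = start
        self.accept = accept
        self.transitions = defaultdict(set)
        self.epsilon = defaultdict(set)

    def eps_closure(self, states):
        # Expand a state set with everything reachable via epsilon moves.
        stack, closure = list(states), set(states)
        while stack:
            s = stack.pop()
            for t in self.epsilon[s]:
                if t not in closure:
                    closure.add(t)
                    stack.append(t)
        return closure

    def matches(self, text):
        # Track ALL reachable states at once -- no backtracking needed.
        current = self.eps_closure({self.start})
        for ch in text:
            nxt = set()
            for s in current:
                nxt |= self.transitions[(s, ch)]
            current = self.eps_closure(nxt)
            if not current:  # no live states left: fail immediately
                return False
        return self.accept in current

# Hand-built NFA recognizing the regex a(b|c)*d:
nfa = NFA(start=0, accept=3)
nfa.transitions[(0, 'a')].add(1)
nfa.transitions[(1, 'b')].add(1)
nfa.transitions[(1, 'c')].add(1)
nfa.transitions[(1, 'd')].add(3)

assert nfa.matches('abcbd')
assert not nfa.matches('abx')
```

Because each character is consumed exactly once, worst-case time stays linear in the input, which is why this approach cannot hit the exponential blowup above; the trade-off is that it cannot support backreferences.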
