Re: Seeking regex optimizer

2006-06-20 Thread andrewdalke
Kay Schluehr replied to my question: Why do you want to use a regex for this? Because it is part of a tokenizer that already uses regexps and I do not intend to rewrite / replace it. Switching to pytst is not a big change - there will be little impact on the rest of your code. On the other

Re: Seeking regex optimizer

2006-06-20 Thread Mirco Wahab
Thus spoke [EMAIL PROTECTED] (on 2006-06-20 01:39): Hi, are you the A.Dalke from the Schulten group (VMD) as listed here: http://www.ks.uiuc.edu/Overview/People/former.cgi Replying to me Mirco Wahab wrote: If you pull the strings into (?( ... )) (atomic groups), this wouldn't happen. Given

Re: Seeking regex optimizer

2006-06-20 Thread andrewdalke
Mirco Wahab wrote: Hi, are you the A.Dalke from the Schulten group (VMD) as listed here: http://www.ks.uiuc.edu/Overview/People/former.cgi Yes. But I left there nearly a decade ago. # naive regex '\d+9' # find some number only if it ends by 9 my

Re: Seeking regex optimizer

2006-06-19 Thread Kay Schluehr
Mirco, with special characters I mentioned control characters of regular expressions i.e. one of .^$()?[]{}\|+* but not non ascii-127 characters. For a workaround you simply have to mangle those using an escape control character: REGEXCHAR = .^$()?[]{}\\|+* def mangle(s): pattern = []
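
The workaround described above (escaping regex control characters before a string goes into a pattern) can be sketched as follows. This is not the `mangle` function from the post, which is cut off in this listing; it is a minimal illustration that assumes the standard library's `re.escape` is acceptable for the job, and `escape_literal` is a hypothetical name.

```python
import re

def escape_literal(s):
    # Hypothetical helper (not from the original post): backslash-escape
    # every regex metacharacter so that `s` is matched literally.
    return re.escape(s)

# "a+b(c)" contains the metacharacters +, ( and ).
pattern = re.compile(escape_literal("a+b(c)"))

literal_hit = pattern.match("a+b(c)")   # the literal text matches
regex_like = pattern.match("aab(c)")    # "+" is no longer a quantifier
```

With the escaping in place, `literal_hit` is a match object and `regex_like` is `None`, because `+` is treated as a plain character.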

Re: Seeking regex optimizer

2006-06-19 Thread John Machin
On 19/06/2006 7:06 PM, Kay Schluehr wrote: Mirco, with special characters I mentioned control characters of regular expressions i.e. one of .^$()?[]{}\|+* but not non ascii-127 characters. For a workaround you simply have to mangle those using an escape control character: REGEXCHAR =

Re: Seeking regex optimizer

2006-06-19 Thread gry
Kay Schluehr wrote: I have a list of strings ls = [s_1,s_2,...,s_n] and want to create a regular expression sx from it, such that sx.match(s) yields a SRE_Match object when s starts with an s_i for one i in [0,...,n]. There might be relations between those strings: s_k.startswith(s_1) - True

Re: Seeking regex optimizer

2006-06-19 Thread andrewdalke
Kay Schluehr wrote: I have a list of strings ls = [s_1,s_2,...,s_n] and want to create a regular expression sx from it, such that sx.match(s) yields a SRE_Match object when s starts with an s_i for one i in [0,...,n]. Why do you want to use a regex for this? When you have constant strings
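
For comparison, the prefix test in the question can be written without a regex at all. The sketch below is only an illustration of that idea, not the approach recommended in the truncated reply; `starts_with_any` is a hypothetical name, and the linear scan assumes the list of strings is small.

```python
def starts_with_any(s, prefixes):
    """Return the longest string in `prefixes` that `s` starts with, or None."""
    best = None
    for p in prefixes:
        # Keep the longest matching prefix so "foobar" wins over "foo".
        if s.startswith(p) and (best is None or len(p) > len(best)):
            best = p
    return best

found = starts_with_any("foobarbaz", ["foo", "foobar", "bar"])
missing = starts_with_any("xyz", ["foo", "foobar", "bar"])
```

If only a yes/no answer is needed, `s.startswith(tuple(prefixes))` does the same test in one call.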

Re: Seeking regex optimizer

2006-06-19 Thread Mirco Wahab
Thus spoke [EMAIL PROTECTED] (on 2006-06-19 22:51): It uses Aho-Corasick for the implementation which is fast and does what you expect it to do. Nor does it have a problem of matching more than 99 possible strings as the regexp approach may have. If you pull the strings into (?( ... ))

Re: Seeking regex optimizer

2006-06-19 Thread andrewdalke
Replying to me Mirco Wahab wrote: If you pull the strings into (?( ... )) (atomic groups), this would't happen. Given that Python's re engine doesn't support this feature it doesn't really help the original poster's problem. Even if some future Python did support it, the limit to 100 named

Re: Seeking regex optimizer

2006-06-19 Thread Kay Schluehr
[EMAIL PROTECTED] wrote: Kay Schluehr wrote: I have a list of strings ls = [s_1,s_2,...,s_n] and want to create a regular expression sx from it, such that sx.match(s) yields a SRE_Match object when s starts with an s_i for one i in [0,...,n]. Why do you want to use a regex for this?

Re: Seeking regex optimizer

2006-06-18 Thread Paddy
Kay Schluehr wrote: I have a list of strings ls = [s_1,s_2,...,s_n] and want to create a regular expression sx from it, such that sx.match(s) yields a SRE_Match object when s starts with an s_i for one i in [0,...,n]. There might be relations between those strings: s_k.startswith(s_1) - True

Re: Seeking regex optimizer

2006-06-18 Thread Kay Schluehr
Paddy wrote: Kay Schluehr wrote: I have a list of strings ls = [s_1,s_2,...,s_n] and want to create a regular expression sx from it, such that sx.match(s) yields a SRE_Match object when s starts with an s_i for one i in [0,...,n]. There might be relations between those strings:

Re: Seeking regex optimizer

2006-06-18 Thread John Machin
On 19/06/2006 6:30 AM, Paddy wrote: Kay Schluehr wrote: I have a list of strings ls = [s_1,s_2,...,s_n] and want to create a regular expression sx from it, such that sx.match(s) yields a SRE_Match object when s starts with an s_i for one i in [0,...,n]. There might be relations between those

Re: Seeking regex optimizer

2006-06-18 Thread Mirco Wahab
Thus spoke Kay Schluehr (on 2006-06-18 19:07): I have a list of strings ls = [s_1,s_2,...,s_n] and want to create a regular expression sx from it, such that sx.match(s) yields a SRE_Match object when s starts with an s_i for one i in [0,...,n]. There might be relations between those strings:

Re: Seeking regex optimizer

2006-06-18 Thread Paddy
Kay Schluehr wrote: SNIP with reverse sorting as in your proposal. The naive solution is easy to generate but I'm sceptical about its cost effectiveness. On the other hand I do not want to investigate this matter if somebody else already did it thoroughly. Regards, Kay Hi Kay, The only way
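
The "naive solution" discussed in this thread, a plain alternation of the strings, might look like the sketch below. Two points are assumptions rather than anything taken from the posts: the strings are sorted longest first (Python's `re` tries alternatives left to right, so a shorter string would otherwise shadow a longer one that starts with it), and `build_prefix_regex` is a hypothetical name. It makes no attempt to optimize the pattern.

```python
import re

def build_prefix_regex(strings):
    # Longest first, so "foobar" is tried before its prefix "foo".
    ordered = sorted(set(strings), key=len, reverse=True)
    # Escape each string so metacharacters are matched literally.
    return re.compile("|".join(re.escape(s) for s in ordered))

sx = build_prefix_regex(["foo", "foobar", "ba+r"])

m = sx.match("foobarbaz")
matched = m.group(0) if m else None
```

Here `sx.match(s)` returns a match object exactly when `s` starts with one of the listed strings, and `matched` holds the longest such string. Note that an empty input list yields an empty pattern, which matches everything.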