Re: Seeking regex optimizer

2006-06-20 Thread andrewdalke
Mirco Wahab wrote: > Hi, are you the A.Dalke from the Schulten group (VMD) as > listed here: http://www.ks.uiuc.edu/Overview/People/former.cgi Yes. But I left there nearly a decade ago. > # naive regex '\d+9' > # find some number only if it ends by 9 > my $str="10099000

Re: Seeking regex optimizer

2006-06-20 Thread Mirco Wahab
Thus spoke [EMAIL PROTECTED] (on 2006-06-20 01:39): Hi, are you the A.Dalke from the Schulten group (VMD) as listed here: http://www.ks.uiuc.edu/Overview/People/former.cgi > Replying to me Mirco Wahab wrote: >> If you pull the strings into (?>( ... )) (atomic groups), >> this wouldn't happen. > >

Re: Seeking regex optimizer

2006-06-20 Thread andrewdalke
Kay Schluehr replied to my question: > > Why do you want to use a regex for this? > > Because it is part of a tokenizer that already uses regexps and I do > not intend to rewrite / replace it. Switching to pytst is not a big change - there will be little impact on the rest of your code. On the ot

Re: Seeking regex optimizer

2006-06-19 Thread Kay Schluehr
[EMAIL PROTECTED] wrote: > Kay Schluehr wrote: > > I have a list of strings ls = [s_1,s_2,...,s_n] and want to create a > > regular expression sx from it, such that sx.match(s) yields a SRE_Match > > object when s starts with an s_i for one i in [0,...,n]. > > Why do you want to use a regex for th

Re: Seeking regex optimizer

2006-06-19 Thread andrewdalke
Replying to me Mirco Wahab wrote: > If you pull the strings into (?>( ... )) (atomic groups), > this wouldn't happen. Given that Python's re engine doesn't support this feature, it doesn't really help with the original poster's problem. Even if some future Python did support it, the limit to 100 named g

Re: Seeking regex optimizer

2006-06-19 Thread Mirco Wahab
Thus spoke [EMAIL PROTECTED] (on 2006-06-19 22:51): > It uses Aho-Corasick for the implementation which is fast and does what > you expect it to do. Nor does it have a problem of matching more than > 99 possible strings as the regexp approach may have. If you pull the strings into (?>( ... )) (a

Re: Seeking regex optimizer

2006-06-19 Thread andrewdalke
Kay Schluehr wrote: > I have a list of strings ls = [s_1,s_2,...,s_n] and want to create a > regular expression sx from it, such that sx.match(s) yields a SRE_Match > object when s starts with an s_i for one i in [0,...,n]. Why do you want to use a regex for this? When you have constant strings t
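For constant strings, one regex-free option is a longest-prefix lookup in a trie. The snippet above is cut off, so this is only a sketch of the general kind of alternative being raised, not the poster's actual suggestion; `build_trie` and `longest_prefix` are hypothetical names, and plain nested dicts stand in for a real trie library.

```python
def build_trie(strings):
    """Build a nested-dict trie; the key None marks the end of a complete string."""
    root = {}
    for s in strings:
        node = root
        for ch in s:
            node = node.setdefault(ch, {})
        node[None] = s  # None can never collide with a character key
    return root

def longest_prefix(trie, text):
    """Return the longest stored string that `text` starts with, or None."""
    node, best = trie, None
    for ch in text:
        if None in node:
            best = node[None]  # remember the longest complete string so far
        node = node.get(ch)
        if node is None:
            return best  # no stored string continues with this character
    # Ran out of text: the current node may itself end a stored string.
    return node.get(None, best)

trie = build_trie(["a", "ab", "abc"])
print(longest_prefix(trie, "abcd"))  # abc
```

Unlike a regex alternation, this lookup has no limit on the number of strings and does not depend on the order in which they were supplied.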

Re: Seeking regex optimizer

2006-06-19 Thread gry
Kay Schluehr wrote: > I have a list of strings ls = [s_1,s_2,...,s_n] and want to create a > regular expression sx from it, such that sx.match(s) yields a SRE_Match > object when s starts with an s_i for one i in [0,...,n]. There might > be relations between those strings: s_k.startswith(s_1) ->

Re: Seeking regex optimizer

2006-06-19 Thread John Machin
On 19/06/2006 7:06 PM, Kay Schluehr wrote: > Mirco, > > with "special characters" I mentioned control characters of regular > expressions i.e. one of ".^$()?[]{}\|+*" but not non ascii-127 > characters. > > For a workaround you simply have to "mangle" those using an escape > control character: >

Re: Seeking regex optimizer

2006-06-19 Thread Kay Schluehr
Mirco, by "special characters" I meant the control characters of regular expressions, i.e. one of ".^$()?[]{}\|+*", not characters outside the ASCII range (above 127). As a workaround you simply have to "mangle" those using an escape control character: REGEXCHAR = ".^$()?[]{}\\|+*" def mangle(s): pattern = [
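The `mangle` function above is truncated, so the following is only a guess at its intent: put a backslash in front of every regex control character. The body shown here is an assumption, not the poster's code. Python's standard `re.escape` does the same job and may be the simpler choice.

```python
import re

REGEXCHAR = ".^$()?[]{}\\|+*"

def mangle(s):
    # Assumed behaviour (the original body is cut off): prefix each
    # regex control character with a backslash, leave the rest alone.
    return "".join("\\" + c if c in REGEXCHAR else c for c in s)

print(mangle("a+b"))  # a\+b

# The escaped pattern matches the literal text only.
assert re.match(mangle("a.b"), "a.b")
assert re.match(mangle("a.b"), "axb") is None

# re.escape is the standard-library equivalent for this purpose.
assert re.match(re.escape("a.b"), "axb") is None
```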

Re: Seeking regex optimizer

2006-06-18 Thread Paddy
Kay Schluehr wrote: > with reverse sorting as in your proposal. The naive solution is easy to > generate but I'm sceptical about its cost effectiveness. On the other > hand I do not want to investigate this matter if somebody else already > did it thoroughly. > > Regards, > Kay Hi Kay, The only wa

Re: Seeking regex optimizer

2006-06-18 Thread Mirco Wahab
Thus spoke Kay Schluehr (on 2006-06-18 19:07): > I have a list of strings ls = [s_1,s_2,...,s_n] and want to create a > regular expression sx from it, such that sx.match(s) yields a SRE_Match > object when s starts with an s_i for one i in [0,...,n]. There might > be relations between those strin

Re: Seeking regex optimizer

2006-06-18 Thread John Machin
On 19/06/2006 6:30 AM, Paddy wrote: > Kay Schluehr wrote: >> I have a list of strings ls = [s_1,s_2,...,s_n] and want to create a >> regular expression sx from it, such that sx.match(s) yields a SRE_Match >> object when s starts with an s_i for one i in [0,...,n]. There might >> be relations betwe

Re: Seeking regex optimizer

2006-06-18 Thread Kay Schluehr
Paddy wrote: > Kay Schluehr wrote: > > I have a list of strings ls = [s_1,s_2,...,s_n] and want to create a > > regular expression sx from it, such that sx.match(s) yields a SRE_Match > > object when s starts with an s_i for one i in [0,...,n]. There might > > be relations between those strings: s

Re: Seeking regex optimizer

2006-06-18 Thread Paddy
Kay Schluehr wrote: > I have a list of strings ls = [s_1,s_2,...,s_n] and want to create a > regular expression sx from it, such that sx.match(s) yields a SRE_Match > object when s starts with an s_i for one i in [0,...,n]. There might > be relations between those strings: s_k.startswith(s_1) ->

Seeking regex optimizer

2006-06-18 Thread Kay Schluehr
I have a list of strings ls = [s_1,s_2,...,s_n] and want to create a regular expression sx from it, such that sx.match(s) yields a SRE_Match object when s starts with an s_i for one i in [0,...,n]. There might be relations between those strings: s_k.startswith(s_1) -> True or s_k.endswith(s_1) ->
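The naive construction mentioned elsewhere in this thread can be sketched as follows, assuming Python's standard `re` module: escape each string, sort longest first so that a string is tried before any of its own prefixes, and join the results with `|`. The helper name `build_prefix_regex` is made up for illustration, and the sketch makes no attempt at the optimization the question asks about.

```python
import re

def build_prefix_regex(strings):
    """Compile an alternation matching any of `strings` at the start of the input.

    Hypothetical helper for illustration; not taken from the thread.
    """
    # re's alternation takes the first branch that matches, so "abc"
    # has to come before "ab" for the longest string to win.
    ordered = sorted(set(strings), key=len, reverse=True)
    return re.compile("|".join(re.escape(s) for s in ordered))

sx = build_prefix_regex(["a", "ab", "abc", "b+"])
m = sx.match("abcd")
print(m.group(0))  # abc
```

Note that an empty input list produces an empty pattern, which matches every string, so callers may want to handle that case separately.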