James Stroud wrote:
> John Pye wrote:
>> Hi all
>>
>> I have a file with a bunch of perl regular expressions like so:
>>
>> /(^|[\s\(])\*([^ ].*?[^ ])\*([\s\)\.\,\:\;\!\?]|$)/$1'''$2'''$3/ # bold
>> /(^|[\s\(])\_\_([^ ].*?[^ ])\_\_([\s\)\.\,\:\;\!\?]|$)/$1''<b>$2<\/b>''$3/ # italic bold
>> /(^|[\s\(])\_([^ ].*?[^ ])\_([\s\)\.\,\:\;\!\?]|$)/$1''$2''$3/ # italic
>>
>> These are all find/replace expressions delimited as '/search/replace/
>> # comment' where 'search' is the regular expression we're searching
>> for and 'replace' is the replacement expression.
>>
>> Is there an easy and general way that I can split these perl-style
>> find-and-replace expressions into something I can use with Python, eg
>> re.sub('search','replace',str) ?
>>
>> I thought generally it would be good enough to split on '/' but as you
>> see the <\/b> messes that up. I really don't want to learn perl
>> here :-)
>>
>> Cheers
>> JP
>
> This could be more general: in principle a perl regex could end with a
> "\", e.g. "\\/", but I'm guessing that won't happen here.
>
> py> for p in perlish:
> ...     print p
> ...
> /(^|[\s\(])\*([^ ].*?[^ ])\*([\s\)\.\,\:\;\!\?]|$)/$1'''$2'''$3/
> /(^|[\s\(])\_\_([^ ].*?[^ ])\_\_([\s\)\.\,\:\;\!\?]|$)/$1''<b>$2<\/b>''$3/
> /(^|[\s\(])\_([^ ].*?[^ ])\_([\s\)\.\,\:\;\!\?]|$)/$1''$2''$3/
> py> import re
> py> splitter = re.compile(r'[^\\]/')
> py> for p in perlish:
> ...     print splitter.split(p)
> ...
> ['/(^|[\\s\\(])\\*([^ ].*?[^ ])\\*([\\s\\)\\.\\,\\:\\;\\!\\?]|$', "$1'''$2'''$", '']
> ['/(^|[\\s\\(])\\_\\_([^ ].*?[^ ])\\_\\_([\\s\\)\\.\\,\\:\\;\\!\\?]|$', "$1''<b>$2<\\/b>''$", '']
> ['/(^|[\\s\\(])\\_([^ ].*?[^ ])\\_([\\s\\)\\.\\,\\:\\;\\!\\?]|$', "$1''$2''$", '']
>
> (I'm hoping this doesn't wrap!)
>
> James

I realized that threw away the closing parentheses: the character before
each "/" (the ")" and the "3") was consumed as part of the match. This is
the correct version, using a negative lookbehind instead:

py> splitter = re.compile(r'(?<!\\)/')
py> for p in perlish:
...     print splitter.split(p)
...
['', '(^|[\\s\\(])\\*([^ ].*?[^ ])\\*([\\s\\)\\.\\,\\:\\;\\!\\?]|$)', "$1'''$2'''$3", '']
['', '(^|[\\s\\(])\\_\\_([^ ].*?[^ ])\\_\\_([\\s\\)\\.\\,\\:\\;\\!\\?]|$)', "$1''<b>$2<\\/b>''$3", '']
['', '(^|[\\s\\(])\\_([^ ].*?[^ ])\\_([\\s\\)\\.\\,\\:\\;\\!\\?]|$)', "$1''$2''$3", '']

James
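
For completeness, here is a rough sketch of how the split pieces might be fed to re.sub, as the original question asked. It is written for Python 3 (so print is a function, unlike the session above), and the names DELIM and perl_to_python are made up for illustration. It assumes the '/search/replace/ # comment' layout from the question, turns Perl's $1 into Python's \g<1>, and unescapes \/ to /. Like the lookbehind splitter, it does not handle a pattern that ends in an escaped backslash ("\\/"), and any other backslashes in the replacement would be interpreted by re.sub.

```python
import re

# Split on "/" only when it is not preceded by a backslash.
# Assumption: no pattern ends in an escaped backslash ("\\/").
DELIM = re.compile(r'(?<!\\)/')


def perl_to_python(expr):
    """Turn '/search/replace/ # comment' into (pattern, replacement) for re.sub.

    Hypothetical helper, not part of any library.
    """
    # maxsplit=3 keeps any "/" inside the trailing comment from adding parts.
    parts = DELIM.split(expr.strip(), maxsplit=3)
    if len(parts) != 4 or parts[0] != '':
        raise ValueError('expected /search/replace/: %r' % expr)
    search, replace = parts[1], parts[2]
    # "\/" was only escaped because "/" is the delimiter; undo that.
    search = search.replace('\\/', '/')
    replace = replace.replace('\\/', '/')
    # Perl group references ($1) become Python ones (\g<1>).
    replace = re.sub(r'\$(\d+)', lambda m: '\\g<%s>' % m.group(1), replace)
    return search, replace


if __name__ == '__main__':
    rule = r"/(^|[\s\(])\*([^ ].*?[^ ])\*([\s\)\.\,\:\;\!\?]|$)/$1'''$2'''$3/ # bold"
    pattern, replacement = perl_to_python(rule)
    print(re.sub(pattern, replacement, 'this is *bold* text'))
    # prints: this is '''bold''' text
```

Using \g<1> rather than \1 avoids ambiguity when a group reference is followed directly by a digit in the replacement text.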