Re: [Tutor] problem in replacing regex

Moos Heintzen Wed, 08 Apr 2009 15:06:16 -0700

Hi,

You can do the substitution in many ways.


You can first search for bare account numbers and substitute them with
URLs, then wrap the URLs in <a></a> tags.

To substitute account numbers that aren't in URLs, you simply
substitute account numbers if they aren't preceded by a "/", as you
have been trying to do.

re.sub() can accept a function instead of a string. The function
receives the match object and returns the replacement string. This way
you can do extra processing on each match.

import re

text = """https://hello.com/accid/12345-12

12345-12

http://sadfsdf.com/asdf/asdf/asdf/12345-12

start12345-12end

this won't be replaced
start/123-45end
"""

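def sub_num(m):
    # group(1) is the character before the number, group(2) the number itself.
    if m.group(1) == '/':
        # Preceded by a slash: already part of a url, so leave it unchanged.
        return m.group(0)
    else:
        # put url here
        return m.group(1) + 'http://example.com/' + m.group(2)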

>>> print(re.sub(r'(\D)(\d+-\d+)', sub_num, text))
https://hello.com/accid/12345-12

http://example.com/12345-12

http://sadfsdf.com/asdf/asdf/asdf/12345-12

starthttp://example.com/12345-12end

this won't be replaced
start/123-45end

>>> _
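
One thing to watch for: the pattern above needs a non-digit character
in front of the number, so an account number at the very start of the
text is skipped. A possible alternative, as a rough sketch and not the
only way to do it, is a negative lookbehind that says "not preceded by
a slash" directly. The names BASE_URL, ACCOUNT_RE and link_accounts
are made up for illustration, and example.com is still a placeholder.

```python
import re

# Hypothetical base url, same placeholder as in the example above.
BASE_URL = 'http://example.com/'

# (?<![/\d]) means: the previous character is neither "/" nor a digit.
# Digits are excluded too, otherwise the engine would skip the first
# digit after a slash and match the rest (e.g. "2345-12" in "/12345-12").
ACCOUNT_RE = re.compile(r'(?<![/\d])\d+-\d+')

def link_accounts(text):
    """Prefix every account number that does not follow a slash."""
    return ACCOUNT_RE.sub(lambda m: BASE_URL + m.group(0), text)

sample = "12345-12 and /12345-12 and start12345-12end"
print(link_accounts(sample))
```

Since the lookbehind consumes no characters, the function only has to
deal with the number itself and there is no group(1) to put back.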

This assumes there aren't any <a> tags in the input, so you should
do this before wrapping URLs in <a> tags.
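
Putting the two steps in that order could look roughly like the sketch
below. The url pattern (https?:// followed by non-whitespace) is a
simplifying assumption of mine, not a complete url matcher, and
linkify and BASE_URL are hypothetical names. It also does no HTML
escaping, so treat it as a starting point only.

```python
import re

# Hypothetical base url for bare account numbers.
BASE_URL = 'http://example.com/'

def sub_num(m):
    # Leave numbers that directly follow a slash (already inside a url).
    if m.group(1) == '/':
        return m.group(0)
    return m.group(1) + BASE_URL + m.group(2)

def wrap_url(m):
    # Use the matched url as both the href and the link text.
    url = m.group(0)
    return '<a href="%s">%s</a>' % (url, url)

def linkify(text):
    # Step 1: turn bare account numbers into urls.
    text = re.sub(r'(\D)(\d+-\d+)', sub_num, text)
    # Step 2: wrap every url in an <a> tag (simplified url pattern).
    return re.sub(r'https?://\S+', wrap_url, text)

print(linkify("see 12345-12 now"))
print(linkify("https://hello.com/accid/12345-12"))
```

Note that the simplified url pattern runs up to the next whitespace, so
in a case like "start12345-12end" the trailing "end" would end up
inside the link.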


I have super cow powers!

Moos