[forwarded to list] On Tue, 7 Apr 2009 12:23:33 +0530, Kumar <hihir...@gmail.com> wrote:
> Hi denis,
>
> Thanks a lot for the reply.
>
> Actually on our web application when we display the data, at that time we do
> parsing and make hyperlinks (through <a>) wherever possible. so if there is
> any url like (http://www.hello.com) then while displaying data we convert it
> to <a href="http://www.hello.com">http://www.hello.com</a>
> and if we find any account number then we make them to go to our default
> account page
> e.g. text is "I am using 12345-45". then while viewing we replace it with
> following.
> I am using <a href="http://helloc.com/accid/12345-45">12345-45</a>
>
> I hope above example would clear your problem.
>
> now my problem is first i try to convert all existing link to <a> tag. this
> work perfectly fine.
> so e.g. the value is "I am using this url http://hello.com/accid/12345-45"
> then as per above algorithm it works perfectly fine and change it to
> following.
> I am using this url <a href="http://hello.com/accid/12345-45">
> http://hello.com/accid/12345-45</a>
> now after that i again replace all accids to convert into url so above value
> become following
> I am using this url <a href="http://hello.com/accid=<a href="
> http://hello.com/accid/12345-45">12345-45</a>">http://hello.com/accid=<a
> href="http://hello.com/accid/12345-45">12345-45<a></a>
>
> and the complete link is messed up.
> so while converting the accids into url i want to exclude the values which
> start with http (e.g. http://hello.com/accid/12345-45)
>
> i hope it becomes more clear now.
> one solution i have is i can exclude the accids start with / i.e. /
> http://hello.com/accid/12345-45 but that is not perfect solution.
>
> Any pointer would be very helpful.
>
> Thanks,
> Kumar

OK, now I understand. You need to convert both URLs and account numbers to HTML links. Whichever order you choose, the account numbers end up encoded twice, because the second pass matches text that the first pass has already turned into a link.
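
To make the problem concrete, here is a minimal sketch of the two-pass approach. The two patterns and the function names are my own assumptions, certainly simpler than whatever your application really uses:

```python
import re

# Assumed, simplified formats (not taken from the real application).
URL = re.compile(r"http://\S+")
ACCID = re.compile(r"\d{5}-\d{2}")

def link_urls(text):
    # Pass 1: wrap every URL in an <a> tag.
    return URL.sub(
        lambda m: '<a href="%s">%s</a>' % (m.group(), m.group()), text)

def link_accids(text):
    # Pass 2: wrap every account number in a link to the account page.
    return ACCID.sub(
        lambda m: '<a href="http://hello.com/accid/%s">%s</a>'
                  % (m.group(), m.group()), text)

source = "I am using this url http://hello.com/accid/12345-45"
step1 = link_urls(source)    # one well-formed link
step2 = link_accids(step1)   # numbers inside that link get wrapped again
print(step1)
print(step2)
```

After the second pass the number has been replaced both inside the href attribute and inside the link text, which is the nesting you describe.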

My solution (maybe not the best) would be to do both operations in one go, using a pattern that matches both and a smarter link-writer function able to distinguish a URL from an account number. Pseudo code:

    toLinkPattern = re.compile("(urlFormat)|(accountNumFormat)")
    def toLink(match):
        string = match.group()
        if isAccountNum(string):
            return accountNumToLink(string)
        return urlToLink(string)
    result = toLinkPattern.sub(toLink, source)

To make things easier, note that using groups() instead of group() will also tell you which kind of thing has been matched, from its position in the tuple of groups. E.g.:

    import re
    pat = re.compile("([1-9])|([a-z])")
    print(pat.findall("a1b2c"))
    def replace(match):
        # groups() holds one entry per group; the group that did not match is None
        print(match.group(), match.groups())
        (digit, letter) = match.groups()
        print("digit:%s letter:%s" % (digit, letter))
        if digit is not None:
            return "0"
        return '@'
    print(pat.sub(replace, "a1b2c"))

    ==>
    [('', 'a'), ('1', ''), ('', 'b'), ('2', ''), ('', 'c')]
    a (None, 'a')
    digit:None letter:a
    1 ('1', None)
    digit:1 letter:None
    b (None, 'b')
    digit:None letter:b
    2 ('2', None)
    digit:2 letter:None
    c (None, 'c')
    digit:None letter:c
    @0@0@

You can also use named groups:

    import re
    pat = re.compile(r"(?P<digit>\d)|(?P<letter>[a-z])")
    def replace(match):
        digit, letter = (match.group('digit'), match.group('letter'))
        print("digit:%s letter:%s" % (digit, letter))
        if digit is not None:
            # or better directly: if match.group('digit') is not None:
            return "0"
        return '@'
    print(pat.sub(replace, "a1b2c"))

    ==>
    digit:None letter:a
    digit:1 letter:None
    digit:None letter:b
    digit:2 letter:None
    digit:None letter:c
    @0@0@

Denis
------
la vita e estrany
_______________________________________________
Tutor maillist - Tutor@python.org
http://mail.python.org/mailman/listinfo/tutor
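
PS: applied to your case, the one-pass idea might look roughly like the sketch below. The URL and account-number formats, the ACCOUNT_PAGE constant and the names to_link and linkify are all guesses on my side, so adapt them to your real data. In particular, `\S+` is a crude URL pattern that would also swallow trailing punctuation.

```python
import re

# Hypothetical formats: a URL is "http://" up to the next whitespace,
# an account number is five digits, a dash, two digits.
TO_LINK = re.compile(r"(?P<url>http://\S+)|(?P<accid>\b\d{5}-\d{2}\b)")
ACCOUNT_PAGE = "http://hello.com/accid/"  # assumed account page prefix

def to_link(match):
    # Exactly one of the two named groups is set for each match.
    accid = match.group("accid")
    if accid is not None:
        return '<a href="%s%s">%s</a>' % (ACCOUNT_PAGE, accid, accid)
    url = match.group("url")
    return '<a href="%s">%s</a>' % (url, url)

def linkify(source):
    # sub() scans left to right and its matches never overlap, so a number
    # sitting inside a URL is consumed as part of that URL.
    return TO_LINK.sub(to_link, source)

print(linkify("I am using 12345-45"))
print(linkify("I am using this url http://hello.com/accid/12345-45"))
```

Since the text is scanned only once, nothing that has already been turned into a link is ever matched again.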