Thanks Kushal and Steve. I think it works. I say "I think" because in the results I got a strange character instead of the letter that should appear.
This is my regexp:

    contents = re.sub(r'(<u>|<span style="text-decoration: underline;">)(l|L|n|N|t|T)(</span>|</u>)', '\2\'', contents)

This is my input file content:

    <u>l</u>omo <u>n</u>omo <u>t</u>omo <u>L</u>omo <u>N</u>omo <u>T</u>omo <span style="text-decoration: underline;">n</span>omo <u>t</u>omo

This is my output file content:

    'omo 'omo 'omo 'omo 'omo 'omo 'omo 'omo

At the head of the file I have:

    #!/usr/bin/env python
    # -*- coding: utf-8 -*-

I tried changing the coding to iso-8859-15, but nothing changed. I am sure you know the reason for this; can you share it with this poor newbie?

Thanks a lot!!

On Wed, March 30, 2011 09:46, Kushal Kumaran wrote:
2011/3/30 "Andrés Chandía" <and...@chandia.net>:
>
> I'm new to this list, so hello everybody!
>

Hello Andrés

> The stuff:
>
> I'm working with regexps and this is my line:
>
> contents = re.sub("<u>l<\/u>", "le" ,contents)
>
> In Perl there is a way to reference previous registers, i.e.
>
> $text =~ s/<u>(l|L|n|N)<\/u>/$1e/g;
>
> So I'm looking for the way to do it in Python; obviously this does not work:
>
> contents = re.sub("<u>(l|L|n|N)<\/u>", "$1e", contents)
>

You will use \1 for the backreference. The documentation of the re module (http://docs.python.org/library/re.html#re.sub) has an example. Also note the use of raw strings (r'...') to avoid having to escape the backslash with another backslash.

_______________________
andrés chandía
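
A likely explanation for the strange character, offered as a sketch rather than a confirmed diagnosis: the pattern in the message is a raw string, but the replacement '\2\'' is not. Python therefore reads \2 as an octal escape for the control character chr(2) before re.sub ever sees it. Writing the replacement as a raw string too, as the note about r'...' suggests, lets \2 reach re.sub as a reference to the second group. The names pattern, sample, broken, and fixed below are illustrative and not from the original script, and sample is a shortened version of the input file.

```python
import re

# Pattern copied from the message above (already a raw string).
pattern = r'(<u>|<span style="text-decoration: underline;">)(l|L|n|N|t|T)(</span>|</u>)'

# Shortened, illustrative version of the input file content.
sample = '<u>l</u>omo <span style="text-decoration: underline;">n</span>omo'

# Non-raw replacement: Python turns '\2' into chr(2) while parsing the
# literal, so re.sub inserts that control character, not group 2.
broken = re.sub(pattern, '\2\'', sample)

# Raw replacement: the backslash survives, and re.sub interprets \2
# as "the text matched by the second group".
fixed = re.sub(pattern, r"\2'", sample)

print(repr(broken))
print(repr(fixed))
```

If this is the cause, changing the coding declaration at the head of the file would make no difference, because the stray character comes from the string literal and not from the file's encoding.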