Bug? concatenate a number to a backreference: re.sub(r'(zzz:)xxx', r'\1'+str(4444), somevar)

2009-10-23 Thread abdulet
Well, is this normal? I want to concatenate a number to a
backreference in a regular expression. I'm working on a multiprocess
script, so my first thought was an error in the multiprocess logic,
but what a surprise! After some time debugging I see that:

>>> import re
>>> aa = "zzz:xxx"
>>> re.sub(r'(zzz:).*', r'\1'+str(3333), aa)
'[33'

¿?¿?¿? Well, let's put a ':' after the backreference:

>>> aa = "zzz:xxx"
>>> re.sub(r'(zzz).*', r'\1:'+str(3333), aa)
'zzz:3333'

Now it's the expected result. So should I expect Python to
concatenate the string to the backreference before substituting the
backreference? Or is it a bug?

Tested on:
Python 2.6.2 (r262:71605, Apr 14 2009, 22:40:02) [MSC v.1500 32 bit
(Intel)] on win32
Python 2.5.2 (r252:60911, Jan  4 2009, 17:40:26) [GCC 4.3.2] on linux2

with the same result on both.

Cheers!
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Bug? concatenate a number to a backreference: re.sub(r'(zzz:)xxx', r'\1'+str(4444), somevar)

2009-10-23 Thread Peter Otten
abdulet wrote:

> Well, is this normal? I want to concatenate a number to a
> backreference in a regular expression. I'm working on a multiprocess
> script, so my first thought was an error in the multiprocess logic,
> but what a surprise! After some time debugging I see that:
>
> >>> import re
> >>> aa = "zzz:xxx"
> >>> re.sub(r'(zzz:).*', r'\1'+str(3333), aa)
> '[33'

If you perform the addition you get r"\13333". How should the regular
expression engine interpret that? As the backreference to group 1, 13, ...
or 13333? It picks something completely different, "[33", because "\133" is
the octal escape sequence for "[":

>>> chr(0133)
'['
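
The effect can be reproduced in isolation. The following is only a small sketch, written for Python 3 (where the octal literal is spelled 0o133 rather than 0133); the variable names text, number and template are made up for illustration:

```python
import re

text = "zzz:xxx"
number = 3333

# Concatenation happens before re.sub() ever sees the string,
# so the replacement template is the six characters \13333.
template = r'\1' + str(number)

# The template parser reads \133 as an octal escape (the character '[');
# the remaining "33" is copied into the result as literal text.
result = re.sub(r'(zzz:).*', template, text)

print(template)    # \13333
print(result)      # [33
print(chr(0o133))  # [
```

So there is no group reference left in the template at all, which is why "zzz:" disappears from the result.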

You can avoid the ambiguity with

extra = str(number)
extra = re.escape(extra)
re.sub(expr, r"\g<1>" + extra, text)

The re.escape() step is not necessary here, but a good idea in the general 
case when extra is an arbitrary string.
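
As a rough illustration for Python 3, applied to the example from the original post: the \g<1> form ends the group reference explicitly, so digits that follow are plain text. The second variant, passing a function as the replacement, is an alternative not mentioned above; the string it returns is inserted as-is without template parsing. The names text, number and append_number are hypothetical:

```python
import re

text = "zzz:xxx"
number = 3333

# \g<1> delimits the group number, so the digits after it stay literal.
fixed = re.sub(r'(zzz:).*', r'\g<1>' + str(number), text)

# Alternative: a replacement function. Its return value is used verbatim,
# so no escape sequences are interpreted in the appended text.
def append_number(match):
    return match.group(1) + str(number)

via_function = re.sub(r'(zzz:).*', append_number, text)

print(fixed)         # zzz:3333
print(via_function)  # zzz:3333
```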

Peter



Re: Bug? concatenate a number to a backreference: re.sub(r'(zzz:)xxx', r'\1'+str(4444), somevar)

2009-10-23 Thread abdulet
On 23 Oct, 13:54, Peter Otten __pete...@web.de wrote:
> abdulet wrote:
> > Well, is this normal? I want to concatenate a number to a
> > backreference in a regular expression. I'm working on a multiprocess
> > script, so my first thought was an error in the multiprocess logic,
> > but what a surprise! After some time debugging I see that:
> >
> > >>> import re
> > >>> aa = "zzz:xxx"
> > >>> re.sub(r'(zzz:).*', r'\1'+str(3333), aa)
> > '[33'
>
> If you perform the addition you get r"\13333". How should the regular
> expression engine interpret that? As the backreference to group 1, 13, ...
> or 13333? It picks something completely different, "[33", because "\133" is
> the octal escape sequence for "[":
>
> >>> chr(0133)
> '['
>
> You can avoid the ambiguity with
>
> extra = str(number)
> extra = re.escape(extra)
> re.sub(expr, r"\g<1>" + extra, text)
>
> The re.escape() step is not necessary here, but a good idea in the general
> case when extra is an arbitrary string.
>
> Peter

Aha! Nice, thanks. I hadn't seen that part of the re module
documentation, and it was right in front of my eyes :(( As always, it's
something silly, haha. So thanks again, and yes, it's a good idea to
escape the variable ;)

Cheers