Thomas Jollans wrote:
On Wednesday 18 August 2010, it occurred to Brandon Harris to exclaim:
Having trouble using %s with re.sub

test = '/my/word/whats/wrong'
re.sub('(/)word(/)', r'\1\%s\2'%'1000', test)

return is /my/@0/whats/wrong


This has nothing to do with %, of course:

>>> re.sub('(/)word(/)', r'\1\%d\2'%1000, test)
'/my/@0/whats/wrong'
>>> re.sub('(/)word(/)', r'\1\1000\2', test)
'/my/@0/whats/wrong'

Let's see if we can get rid of that zero:

>>> re.sub('(/)word(/)', r'\1\100\2', test)
'/my/@/whats/wrong'

So '\100' appears to be getting replaced with '@'. Why?

>>> '\100'
'@'

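This is an octal escape: a backslash followed by three octal digits stands for the character with that code, and 100 in octal is 64, the code for '@'. Python string literals work this way, and so does the replacement template that re.sub parses, which is what bites here, since the raw string leaves the backslash intact.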

>>> chr(int('100', 8))
'@'
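
To check that it is re.sub's template parsing, and not the string literal, that produces the '@' in the raw-string case, here is a small sketch (the name `template` is mine, not from the original post):

```python
import re

test = '/my/word/whats/wrong'

# In a raw string the backslashes survive, so Python itself does no
# escaping here: the template really is 8 characters long.
template = r'\1\100\2'
assert len(template) == 8

# re.sub then parses the template: \1 and \2 are group references, while
# \100 (backslash plus three octal digits) is read as an octal escape.
result = re.sub('(/)word(/)', template, test)
print(result)  # /my/@/whats/wrong
```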

How to avoid this? Well, if you wanted the literal backslash, you'll need to escape it properly:

>>> print(re.sub('(/)word(/)', r'\1\\1000\2', test))
/my/\1000/whats/wrong

If you didn't want the backslash, then why on earth did you put it there? You have to be careful with backslashes, they bite ;-)

Anyway, you can simply do the formatting after the substitution (as long as the rest of the string contains no stray % characters).

>>> re.sub('(/)word(/)', r'\1%d\2', test) % 1000
'/my/1000/whats/wrong'

Or work with match objects to construct the resulting string by hand.
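
A minimal sketch of the match-object approach, assuming the same test string as above. re.sub accepts a function as the replacement; the names `replace_word` and `number` are made up for illustration:

```python
import re

test = '/my/word/whats/wrong'
number = 1000  # the value to splice in, taken from the example above

def replace_word(match):
    # match.group(1) and match.group(2) are the two captured slashes.
    # The returned string is inserted literally, so no backslash or
    # octal-escape processing is applied to it.
    return '%s%d%s' % (match.group(1), number, match.group(2))

result = re.sub('(/)word(/)', replace_word, test)
print(result)  # /my/1000/whats/wrong
```

Because the function's return value is used as-is, this also stays safe when the inserted value itself contains backslashes.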

You can stop group references which are followed by digits from turning
into octal escapes in the replacement template by using \g<n> instead:

>>> print(r'\1%s' % '00')
\100
>>> print(r'\g<1>%s' % '00')
\g<1>00
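
Applied to the original question, that might look like the following sketch (same pattern and test string as in the thread; the name `template` is mine):

```python
import re

test = '/my/word/whats/wrong'

# \g<1> and \g<2> are unambiguous group references, so the digits that
# follow \g<1> cannot be absorbed into an octal escape.
template = r'\g<1>%s\g<2>' % '1000'
result = re.sub('(/)word(/)', template, test)
print(result)  # /my/1000/whats/wrong
```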
--
http://mail.python.org/mailman/listinfo/python-list
