[EMAIL PROTECTED] wrote:
> using this code:
>
> import re
> s = 'HelloWorld19-FooBar'
> s = re.sub(r'([A-Z]+)([A-Z][a-z])', "\1_\2", s)
> s = re.sub(r'([a-z\d])([A-Z])', "\1_\2", s)
> s = re.sub('-', '_', s)
> s = s.lower()
> print "s: %s" % s
>
> i expect to get:
> hello_world19_foo_bar
>
> but instead i get:
> hell☺_☻orld19_fo☺_☻ar
>
> (in case the above doesn't come across the same, it's:
> hellX_Yorld19_foX_Yar, where X is a white smiley face and Y is a black
> smiley face !!)
>
> is this a bug, or am i doing something wrong?
>

Tim's given you the solution to the problem: with the re module,
*always* use raw strings in regexes and substitution strings.

Here's a simple diagnostic tool that you can use when the visual
presentation of a result leaves you wondering [did you get smiley
faces on Windows in IDLE? on Linux?]:

|>>> print repr(s)
'hell\x01_\x02orld19_fo\x01_\x02ar'
|>>> print "s: %r" % s
s: 'hell\x01_\x02orld19_fo\x01_\x02ar'

HTH,
John
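
For reference, here is a minimal sketch of the raw-string fix described above. The function name to_snake and the variable broken are made up for illustration and do not appear in the original post, and s.replace() stands in for the original re.sub('-', '_', s). In an ordinary string literal, "\1" and "\2" are octal escapes for the characters chr(1) and chr(2), so re.sub never sees a backreference. With the r prefix the backslashes reach the re module intact.

```python
import re

def to_snake(s):
    # Hypothetical helper wrapping the poster's steps.
    # Raw replacement strings keep \1 and \2 as group backreferences.
    s = re.sub(r'([A-Z]+)([A-Z][a-z])', r'\1_\2', s)
    s = re.sub(r'([a-z\d])([A-Z])', r'\1_\2', s)
    s = s.replace('-', '_')  # plain replacement, no regex needed
    return s.lower()

# The same substitution with a non-raw replacement string:
# "\1_\2" is the three characters chr(1), '_', chr(2).
broken = re.sub(r'([a-z\d])([A-Z])', "\1_\2", 'HelloWorld19-FooBar')

print(repr(to_snake('HelloWorld19-FooBar')))
print(repr(broken))
```

Printing with repr(), as suggested above, shows the control characters as \x01 and \x02 escapes instead of whatever glyphs the terminal happens to draw for them.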