On Fri, 09 May 2014 12:51:04 -0700, scottcabit wrote:

> Hi,
>
>  here is a snippet of code that opens a file (fn contains the path\name)
>  and first tries to replace all endash, emdash etc characters with
>  simple dash characters, before doing a search.
>  But the replaces are not having any effect. Obviously a syntax
>  problem....what silly thing am I doing wrong?

You're making the substitution, then throwing the result away. re.sub
returns a new bytes object; it does not modify fStr in place, so you have
to assign the result back. And you're using a nuclear-powered bulldozer to
crack a peanut. This is not a job for regexes, this is a job for normal
string replacement.

>   fn = 'z:\Documentation\Software'
>   def processdoc(fn,outfile):
>     fStr = open(fn, 'rb').read()
>     re.sub(b'&#x2012','-',fStr)

Good:

    fStr = re.sub(b'&#x2012', b'-', fStr)

Better:

    fStr = fStr.replace(b'&#x2012', b'-')

But having said that, you actually can make use of the nuclear-powered
bulldozer, and do all the replacements in one go:

Best:

    # Untested
    fStr = re.sub(b'&#x(?:201[2-5]|2E3[AB]|00[2A]D)', b'-', fStr)

If you're going to unload the power of regexes, unload them on something
that makes it worthwhile. Replacing a constant, fixed string with another
constant, fixed string does not require a regex.

-- 
Steven D'Aprano
http://import-that.dreamwidth.org/
-- 
https://mail.python.org/mailman/listinfo/python-list
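
A minimal sketch of the one-pass approach, assuming the file holds the
dashes as ASCII entity text such as &#x2012. The helper name
normalize_dashes, the sample bytes, and the optional trailing ";?" in the
pattern are illustrative assumptions, not part of the original code. The
alternatives sit inside a single non-capturing group so that the &#x
prefix applies to each of them.

```python
import re

# One pattern covering the dash-like entities discussed in this thread.
# The ";?" is an assumption added here so a closing semicolon, if present,
# is consumed together with the entity.
DASH_ENTITY = re.compile(rb"&#x(?:201[2-5]|2E3[AB]|00[2A]D);?")


def normalize_dashes(data: bytes) -> bytes:
    """Return a copy of data with each dash entity replaced by b'-'."""
    return DASH_ENTITY.sub(b"-", data)


# Hypothetical sample standing in for the bytes read from the file.
sample = b"pages 10&#x2012;12, 2010&#x2014;2014, soft&#x00AD;hyphen"

# bytes are immutable: this call builds a new object and then discards it,
# so sample is left unchanged.
re.sub(b"&#x2012;", b"-", sample)

# Keeping the return value is what makes the replacement take effect.
cleaned = normalize_dashes(sample)
print(cleaned)
```

For a single fixed entity, data.replace(b'&#x2012', b'-') does the same job
without a regex; the compiled pattern only pays off once several entities
are handled together.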