All the code is untested, but should give you the idea.

[EMAIL PROTECTED] writes:

> Hi all!
>
> I have a file in which there are some expressions such as "kindest
> regard" and "yours sincerely". I must create a phyton script that
> checks if a text contains one or more of these expressions and, in
> this case, replaces the spaces in the expression with the character
> "_". For example, the text
>
> Yours sincerely, Marco.
>
> Must be transformated in:
>
> Yours_sincerely, Marco.
>
> Now I have written this code:
>
> filemw = codecs.open(sys.argv[1], "r", "iso-8859-1").readlines()
> filein = codecs.open(sys.argv[2], "r", "iso-8859-1").readlines()
>
> mw = ""
> for line in filemw:
>     mw = mw + line.strip() + "|"

One "|" too many: the trailing "|" leaves an empty alternative at the
end of the pattern. Generally, use join instead of many individual
string +s.

mwfind_re_string = "(%s)" % "|".join(line.strip() for line in filemw)

> mwfind_re = re.compile(r"^(" + mw + ")",re.IGNORECASE|re.VERBOSE)

mwfind_re = re.compile(mwfind_re_string, re.IGNORECASE)

> mwfind_subst = r"_"
>
> for line in filein:

That doesn't work. What about "kindest\nregard"? I think you're best
off reading the whole file in (don't forget to close the files, BTW).

>     line = line.strip()
>     if (line != ""):
>         line = mwfind_re.sub(mwfind_subst, line)
>         print line
>
> It correctly identifies the expressions, but doesn't replace the
> character in the right way. How can I do what I want?

Use the fact that you can also use a function as a substitution.
Note the " " in the join: the stripped lines need a space between
them, or an expression split across a line break would be glued
together and no longer match.

print mwfind_re.sub(lambda match: match.group().replace(' ', '_'),
                    " ".join(line.strip() for line in filein))

'as
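
Putting the pieces above together, a minimal sketch might look like the
following. It is written for Python 3 (the snippets above use the
Python 2 print statement), and the function names build_phrase_re,
underscore_phrases and run are made up for illustration. It also goes
slightly beyond the advice above, as assumptions of mine: expressions
are passed through re.escape, sorted longest first, and matched across
any run of whitespace rather than after joining the lines. re.VERBOSE is
left out because that flag makes the regex ignore literal spaces in the
pattern.

```python
import re


def build_phrase_re(phrases):
    """Compile one case-insensitive regex matching any of the phrases."""
    # Drop blank entries; an empty alternative would match everywhere.
    cleaned = [p.strip() for p in phrases if p.strip()]
    # Longest first, so "best regards to all" is tried before "best regards".
    cleaned.sort(key=len, reverse=True)
    # Escape each word and allow any whitespace run (even a newline)
    # between the words of an expression.
    alternatives = [
        r"\s+".join(re.escape(word) for word in phrase.split())
        for phrase in cleaned
    ]
    return re.compile("(%s)" % "|".join(alternatives), re.IGNORECASE)


def underscore_phrases(text, phrase_re):
    """Replace the whitespace inside every matched expression with "_"."""
    # The replacement is a function: it gets the match object and
    # returns the string to substitute for that match.
    return phrase_re.sub(lambda m: re.sub(r"\s+", "_", m.group()), text)


def run(phrase_path, text_path, encoding="iso-8859-1"):
    """Read both files (closing them afterwards) and return the new text."""
    with open(phrase_path, encoding=encoding) as phrase_file:
        phrase_re = build_phrase_re(phrase_file.readlines())
    with open(text_path, encoding=encoding) as text_file:
        return underscore_phrases(text_file.read(), phrase_re)
```

Calling print(run(sys.argv[1], sys.argv[2])) would mirror the original
script. Be aware that an expression split across a line break comes out
on a single line, because the newline inside the match is replaced by
"_" as well.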