On Tue, Apr 7, 2009 at 12:52 PM, Pirritano, Matthew <mpirrit...@ochca.com> wrote:
> So Kent's syntax worked to convert my Unicode file to plain text. But
> now my data is double space. How can I fix this. Here is the code I'm
> using.
>
> import codecs
>
> inp = codecs.open('g:\\data\\amm\\text files\\test20090320.txt', 'r',
> 'utf-16')
> outp = open('g:\\data\\amm\\text files\\new_text_file.txt', 'w')
> outp.writelines(inp)
> inp.close()
> outp.close()
I guess there is something funny going on with the conversion of newlines. It would help to know what line endings are in the original data, and what are in the new data.

One thing to try is to open the output file as binary: 'wb' instead of 'w'. The input file is opened as binary by the codecs module, so any '\r\n' line endings in the original are passed through unchanged. If the output file is then opened in text mode on Windows, each '\n' is written out as '\r\n' again, which would give '\r\r\n' and could explain the double spacing.

If that doesn't work, you could try to strip line endings from the original, then add them back in the new file:

import codecs

inp = codecs.open('g:\\data\\amm\\text files\\test20090320.txt', 'r', 'utf-16')
outp = open('g:\\data\\amm\\text files\\new_text_file.txt', 'w')
for line in inp:
    # Remove the line ending (and any other trailing whitespace)
    line = line.rstrip()
    outp.write(line)
    # Write a single newline; text mode converts it for the platform
    outp.write('\n')
inp.close()
outp.close()

Note that this will strip all trailing white space from each line of the input; I don't know if that is an issue...

Kent
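
For what it's worth, here is a minimal sketch of another way to avoid the doubled line endings, using io.open (Python 2.6 and later) instead of codecs.open. Reading in text mode with the default newline handling turns '\r\n' into '\n' on input, and the newline argument on the output file controls what gets written. The function name convert_utf16_to_text, the default 'utf-8' output encoding and the default '\n' output line ending are my own assumptions for illustration, not something from the original code; adjust them to whatever the downstream program expects.

```python
import io


def convert_utf16_to_text(src_path, dst_path, dst_encoding='utf-8', newline='\n'):
    """Copy a UTF-16 text file to dst_path with normalized line endings.

    Hypothetical helper: the name, the default output encoding and the
    default output line ending are assumptions, not part of the original post.
    """
    # Text-mode read with the default newline handling: '\r\n', '\r' and
    # '\n' in the input are all seen as '\n' while iterating.
    with io.open(src_path, 'r', encoding='utf-16') as inp:
        # newline='\n' writes '\n' unchanged; pass newline='\r\n' to get
        # Windows line endings regardless of platform.
        with io.open(dst_path, 'w', encoding=dst_encoding, newline=newline) as outp:
            for line in inp:
                outp.write(line)
```

It would be called as convert_utf16_to_text('in.txt', 'out.txt'), with the real paths substituted. Unlike the rstrip() approach, this leaves trailing spaces and tabs on each line alone.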