Tim Michelsen wrote:
> Hello,
> I want to process some files encoded in latin-1 (iso-8859-1) in my
> python script that I write on Ubuntu which has UTF-8 as standard encoding.

Not sure what you mean by "standard encoding" (is this an Ubuntu thing?), but essentially whenever you pull encoded data into Python and want to treat it as Unicode, you need to decode it explicitly, either on a string-by-string basis or by using the codecs module to treat the whole of a file as encoded. In this case, assuming you have files in iso-8859-1, something like this:

<code>
import codecs

filenames = ['a.txt', 'b.txt', 'c.txt']
for filename in filenames:
    # codecs.open decodes the file's bytes to unicode as they are read
    f = codecs.open(filename, encoding="iso-8859-1")
    text = f.read()
    f.close()
    #
    # If you want to re-encode this -- not sure why --
    # you could do this:
    # text = text.encode("utf-8")
    print(repr(text))
</code>

TJG
_______________________________________________
Tutor maillist - Tutor@python.org
http://mail.python.org/mailman/listinfo/tutor
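
For the string-by-string route mentioned in the reply, a minimal sketch might look like the following. It assumes Python 3 syntax (where undecoded data is a bytes object). The helper name to_text and the sample bytes are made up for illustration and are not from the original post.

```python
# Sketch only: decoding one byte string at a time instead of a whole file.
# to_text and the sample data are hypothetical, chosen for illustration.

def to_text(raw, encoding="iso-8859-1"):
    """Decode a byte string to text using the given encoding."""
    return raw.decode(encoding)

raw = b"caf\xe9"                    # "cafe" with e-acute, as latin-1 bytes
text = to_text(raw)                 # now a unicode string of 4 characters
utf8_bytes = text.encode("utf-8")   # re-encode only if something needs UTF-8 bytes

# ascii() escapes non-ASCII characters, so this prints safely on any terminal
print(ascii(text))
print(repr(utf8_bytes))
```

The point is the same as with codecs.open: decode once at the boundary, work with unicode inside the script, and encode again only when writing out.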