Hi Michael and Kent, thanks to your tips I was able to solve my problems! It turned out to be quite easy in the end.
For those interested and struggling with utf-8, ascii and unicode: the right way is

 - string.decode() upon input (if in question)
 - string.encode() upon output (more often than not)

where input and output mean reading from and writing to files, file-like objects, databases... and functions of some modules that are not unicode-proof.

Once I knew that, I got rid of all the calls to encode() and decode() that I had added by trial and error and that had messed everything up. Now I have just a few calls to encode() and voilà! xml.sax seems to read and decode the utf-8 encoded xml file perfectly well, and ZipFile.read() and file.write() work too, with no encoding or decoding on my side.

For me the most important thing was to realize that utf-8 is *not* unicode, although I had already read about this topic (and you can read this advice often here on this list).

On my system sys.stdout and sys.stderr seem to have a utf-8 and a None encoding, respectively (Kubuntu Linux, python2.4, ipython and konsole as terminal). The wrapper suggested by Kent

    sys.stdout = codecs.getwriter('utf-8')(sys.stdout, 'backslashreplace')
    sys.stderr = codecs.getwriter('ascii')(sys.stderr, 'backslashreplace')

solves all my output problems regarding debugging.

Thank you for your help!

Dave

P.S.: The quotation in my signature is there by chance, really. Normally I'm not the kind of guy who believes in premonitions... ;)

-- 
I never realized it before, but having looked that over I'm certain I'd rather have my eyes burned out by zombies with flaming dung sticks than work on a conscientious Unicode regex engine.
    -- Tim Peters, 3 Dec 1998
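For anyone who wants the "decode on input, encode on output" rule spelled out, here is a minimal sketch. It is only an illustration under a few assumptions: it uses Python 3 spelling (bytes.decode() and str.encode()), whereas on python2.4 the same calls live on str and unicode objects, and the helper names read_text and write_text are made up for this example.

```python
import io

def read_text(raw_bytes, encoding="utf-8"):
    """Hypothetical helper: decode bytes to text at the input boundary."""
    return raw_bytes.decode(encoding)

def write_text(text, stream, encoding="utf-8"):
    """Hypothetical helper: encode text to bytes at the output boundary.

    'backslashreplace' turns characters the target encoding cannot
    represent into escape sequences instead of raising an error.
    """
    stream.write(text.encode(encoding, "backslashreplace"))

# UTF-8 encoded bytes, as they might arrive from a file or a database.
raw = b"voil\xc3\xa0"

# Decode once on the way in; from here on we only handle text.
text = read_text(raw)

# Encode once on the way out, here into an in-memory byte stream.
out = io.BytesIO()
write_text(text.upper(), out)

# The same text written to an ascii-only sink gets an escape sequence.
ascii_out = io.BytesIO()
write_text(text, ascii_out, "ascii")
```

The point of the sketch is that utf-8 only shows up at the two boundaries; everything in between works on decoded text and never calls encode() or decode().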
_______________________________________________ Tutor maillist - Tutor@python.org http://mail.python.org/mailman/listinfo/tutor