Albert-Jan Roskam wrote:

> Hi,
>
> With Python 3.5 under Windows I am using the logging module to log
> messages to stdout (and to a file), but this occasionally causes logging
> errors because some characters cannot be represented in the codepage used
> by cmd.exe (cp850, aka OEM codepage, I think). What is the best way to
> prevent this from happening? The program runs fine, but the error is
> distracting. I know I can use s.encode(sys.stdout.encoding, 'replace') and
> log that, but this is ugly and tedious to do when there are many log
> messages. I also don't understand why %r (instead of %s) still causes an
> error. I thought that the character representation uses only ascii
> characters?!

Not in Python 3. You can enforce ascii with "%a":

>>> euro = '\u20ac'
>>> print("%r" % euro)
'€'
>>> print("%a" % euro)
'\u20ac'

Or you can set an error handler with PYTHONIOENCODING (I have to use
something that is not utf-8-encodable for the demo):

$ python3 -c 'print("\udc85")'
Traceback (most recent call last):
  File "<string>", line 1, in <module>
UnicodeEncodeError: 'utf-8' codec can't encode character '\udc85' in position 0: surrogates not allowed
$ PYTHONIOENCODING=:backslashreplace python3 -c 'print("\udc85")'
\udc85

Or you follow the convention and log to stderr, which uses the
"backslashreplace" error handler by default:

$ python3 -c 'import sys; print("\udc85", file=sys.stderr)'
\udc85
$
$ python3 -c 'import logging; logging.basicConfig(); logging.getLogger().warning("\udc85")' > to_prove_it_s_not_stdout
WARNING:root:\udc85

> import logging
> import sys
>
> assert sys.version_info.major > 2
> logging.basicConfig(filename="d:/log.txt",
>                     level=logging.DEBUG, format='%(asctime)s %(message)s')
> handler = logging.StreamHandler(stream=sys.stdout)
> logger = logging.getLogger(__name__)
> logger.addHandler(handler)
>
> s = '\u20ac'
> logger.info("euro sign: %r", s)
>
> --- Logging error ---
> Traceback (most recent call last):
>   File "c:\python3.5\lib\logging\__init__.py", line 982, in emit
>     stream.write(msg)
>   File "c:\python3.5\lib\encodings\cp850.py", line 19, in encode
>     return codecs.charmap_encode(input,self.errors,encoding_map)[0]
> UnicodeEncodeError: 'charmap' codec can't encode character '\u20ac' in
> position 12: character maps to <undefined>
> Call stack:
>   File "q:\temp\logcheck.py", line 10, in <module>
>     logger.info("euro sign: %r", s)
> Message: 'euro sign: %r'
> Arguments: ('\u20ac',)
>
> Thanks in advance for your replies!
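
If you would rather keep the StreamHandler on stdout, the same
"backslashreplace" idea can be applied inside the script instead of through
the environment. What follows is only a minimal sketch: EscapingStream is a
hypothetical helper of my own naming (it is not part of the logging module),
and fake_console is an in-memory stand-in for the cp850 console so that the
example runs on any platform.

```python
import io
import logging


class EscapingStream:
    """Hypothetical wrapper: escape characters the target stream cannot encode."""

    def __init__(self, stream):
        self.stream = stream
        # Fall back to ASCII if the stream does not report an encoding.
        self.encoding = getattr(stream, "encoding", None) or "ascii"

    def write(self, text):
        # Round-trip through the stream's encoding; anything unencodable is
        # turned into a backslash escape instead of raising UnicodeEncodeError.
        safe = text.encode(self.encoding, "backslashreplace").decode(self.encoding)
        return self.stream.write(safe)

    def flush(self):
        self.stream.flush()


# Assumption: an in-memory text stream standing in for the cp850 console.
fake_console = io.TextIOWrapper(io.BytesIO(), encoding="cp850")

logger = logging.getLogger("escaping_demo")
logger.setLevel(logging.DEBUG)
logger.addHandler(logging.StreamHandler(EscapingStream(fake_console)))
logger.propagate = False  # keep the demo away from any root handlers

logger.info("euro sign: %s", "\u20ac")
fake_console.flush()
written = fake_console.buffer.getvalue()  # raw bytes that reached the "console"
```

In the script quoted above this would presumably amount to passing
EscapingStream(sys.stdout) instead of sys.stdout to logging.StreamHandler;
the file handler set up by basicConfig() is not affected.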
>
> Albert-Jan

_______________________________________________
Tutor maillist  -  Tutor@python.org
To unsubscribe or change subscription options:
https://mail.python.org/mailman/listinfo/tutor