On 28/08/2009 10:38 PM, Massa, Harald Armin wrote:
Skip,
    self.unique_name = os.path.join(dirname,
                                    "%s.%s%s" % (self.hostname,
                                                 tname,
                                                 self.pid))
raising a UnicodeDecodeError:
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe9 in
position 3: ordinal not in range(128)
on a Windows system when the hostname contains non-ASCII data. (All the
other elements involved in the os.path.join call are ASCII or numbers.)
A quite common pain in the non-ASCII world of Python programming on
Windows.
Actually, Windows is better than many in this regard - at least the
file-system has a canonical encoding!
The quickest solution I found is to switch the default encoding to UTF-8
via sys.setdefaultencoding; the runner-up is to switch it to latin-1.
Both solutions have been in production in real-world projects, for 5
(utf-8) and 9 years (latin-1), in various apps of mine; and Chris Withers
reports a similar solution that has been running in production for 5 years.
Both solutions were called highly dangerous by MvL and others on
c.p.dev, who pointed out in particular the problems that can arise with
hashes / dict-keys not comparing as expected.
Yes, this is very dangerous advice. Chris is dealing with Zope which
has long been confused wrt unicode, but there simply is no single
encoding suitable as a default if you have code confused about unicode.
In Skip's case, the closest advice to this should be to use the 'mbcs'
encoding - functions which are not unicode-aware (either in pywin32 or
python itself) which call the win32 api will return strings in the
'mbcs' encoding. Ideally we would work out where such a string is
originating and convert it to unicode as soon as it is received.
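
A minimal sketch of that idea follows. The byte string is a made-up stand-in for a hostname returned by a call that is not unicode-aware, the thread name and pid are invented, and latin-1 stands in for 'mbcs' only because 'mbcs' exists on Windows alone:

```python
# Hypothetical byte string standing in for a non-ASCII hostname.
raw_hostname = b"caf\xe9-pc"

# Mixing a byte string with unicode makes Python 2 decode the bytes
# implicitly with the default 'ascii' codec. Doing that step by hand
# shows where the UnicodeDecodeError comes from.
try:
    raw_hostname.decode("ascii")
    ascii_failed = False
except UnicodeDecodeError as exc:
    ascii_failed = True
    error_position = exc.start  # index of the first undecodable byte

# Decode explicitly as soon as the value is received, then work in
# unicode from there on. On Windows the codec would be "mbcs".
hostname = raw_hostname.decode("latin-1")
unique_name = u"%s.%s%s" % (hostname, u"MainThread", 1234)
```

Once every element is unicode, os.path.join and the string formatting have nothing left to decode implicitly.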
For example, consider the tempfile module:
>>> tempfile.gettempdir()
'c:\\users\\skip\\appdata\\local\\temp'
>>>
Note the result is a string and not unicode. If logged in with a user
with an extended char in their username, you may well find an invalid
ascii string is returned (I'm not at my main PC, so can't quickly
demonstrate). Attempting to pass such a string, along with a unicode
arg as received from some other unicode-aware function, to
os.path.join() will then cause such an error. In that case the correct
solution would be to decode the string as soon as gettempdir() is
called, as only at that point can you be sure what the encoding is.
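
Something along these lines might do for the gettempdir() case. It is only a sketch: to_unicode is a hypothetical helper, the file name is invented, and it assumes sys.getfilesystemencoding() reports the encoding the byte string is actually in (which should be 'mbcs' on Windows):

```python
import os
import sys
import tempfile

# Assumption: this names the encoding of byte strings handed back by
# the OS-facing calls; fall back to ASCII if it is unknown.
FS_ENCODING = sys.getfilesystemencoding() or "ascii"


def to_unicode(value, encoding=FS_ENCODING):
    """Decode a byte string to unicode; pass unicode through unchanged."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


# Decode right where the value is obtained, while the encoding is known.
tmpdir = to_unicode(tempfile.gettempdir())

# Every argument is now unicode, so no implicit ASCII decode happens.
lock_path = os.path.join(tmpdir, u"caf\xe9.lock")
```

The helper leaves unicode values alone, so it is safe to apply to results that are already unicode.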
HTH,
Mark
_______________________________________________
python-win32 mailing list
python-win32@python.org
http://mail.python.org/mailman/listinfo/python-win32