Okay, here are specifics that demonstrate the problem.
I have a directory, C:\tmp\腌
In it, there is a file, doo.py
>>> d = os.listdir(u"c:/tmp")[-1]
>>> d
u'\u814c'
>>> d2 = os.listdir(u"c:/tmp/"+d)
>>> d2
[u'doo.py']
>>> p = u"c:/tmp/"+d
>>> p
u'c:/tmp/\u814c'
>>> sys.path.append(p)
>>> import doo
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ImportError: No module named doo
>>> p.encode("mbcs")
'c:/tmp/?'
>>> p.encode("gb2312")
'c:/tmp/\xeb\xe7'
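To make the last two results easier to compare, here is a minimal sketch of the same encoding step. It assumes cp1252 as a stand-in for the "mbcs" codec (which only exists on Windows and follows the machine's ANSI code page), and it is written so it also runs on a current Python 3. The variable names are illustrative only.

```python
# The directory name from the session above, as a Unicode string.
path = u"c:/tmp/\u814c"

# Assumption: cp1252 stands in for "mbcs" on a Western European Windows
# install. The character has no cp1252 equivalent, so "replace" yields "?".
lossy = path.encode("cp1252", "replace")

# gb2312 can represent the character, so nothing is lost.
exact = path.encode("gb2312")

# Decoding shows which 8-bit form still names the real directory.
round_trip_lossy = lossy.decode("cp1252")
round_trip_exact = exact.decode("gb2312")
```

Only the gb2312 bytes decode back to the original name; the replaced form points at a directory literally called "?", which does not exist.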
Running your example test code gives:
Prefixes: C:\PyDev25 C:\PyDev25
Path: ['c:\\tmp', 'c:\\documents and settings\\kristjan\\my documents\\python',
'C:\\PyDev25\\PCbuild8\\python25.zip', 'C:\\PyDev25\\DLLs', 'C:\\PyDev25\\lib',
'C:\\PyDev25\\lib\\plat-win', 'C:\\PyDev25\\lib\\lib-tk', 'C:\\PyDev25\\PCbuild8
', 'C:\\PyDev25', 'C:\\PyDev25\\lib\\site-packages']
Default encoding: ascii
Input encoding: cp850 Output encodings: cp850 cp850
-----Original Message-----
From: Nick Coghlan [mailto:[EMAIL PROTECTED]
Sent: 17. júní 2006 04:17
To: Phillip J. Eby
Cc: Kristján V. Jónsson; Python Dev
Subject: Re: [Python-Dev] unicode imports
Phillip J. Eby wrote:
> Actually, you would want to put it in sys.path_hooks, and then
> instances would be placed in path_importer_cache automatically. If
> you are adding it to the path_hooks after the fact, you should simply
> clear the path_importer_cache. Simply poking stuff into the
> path_importer_cache is not a recommended approach.
Oh, I agree - poking it in directly was a desperation measure if the path_hooks
machinery didn't like Unicode either.
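For reference, the approach Phillip describes could be sketched roughly as below. This is only an illustration under stated assumptions: recording_hook, seen_entries and install_hook are hypothetical names, and the hook declines every path entry rather than supplying a real importer. It is written to run on a current Python 3 as well.

```python
import sys

# Hypothetical log of the path entries the hook was offered.
seen_entries = []

def recording_hook(path_entry):
    """Hypothetical path hook: note the entry, then decline it."""
    seen_entries.append(path_entry)
    # Raising ImportError tells the import machinery to try the next hook.
    raise ImportError("not handled by recording_hook")

def install_hook(hook):
    """Register a hook after the fact and drop stale cached importers."""
    sys.path_hooks.insert(0, hook)
    # Without this, entries already cached would never reach the new hook.
    sys.path_importer_cache.clear()

install_hook(recording_hook)
```

After this, the next import that has to search sys.path offers each entry to recording_hook first, so seen_entries shows exactly what the machinery passes to a hook.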
I've since gone and looked, and you may be screwed either way - the standard
import paths appear to be always put on the system path as encoded 8-bit
strings, not as Unicode objects.
That said, it also appears that the existing machinery *should* be able to
handle non-ASCII path items, so long as 'Py_FileSystemDefaultEncoding' is set
correctly. If it isn't handling it, then there's something else going wrong.
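One rough way to check that condition from Python is to see which path entries cannot be encoded in the file system encoding at all. The helper below is a sketch with a hypothetical name, unencodable_entries. It only tests encodability and says nothing about what the import machinery does internally. It is written to run on a current Python 3.

```python
import sys

def unencodable_entries(paths, encoding):
    """Return the Unicode entries that cannot be encoded with `encoding`."""
    bad = []
    for entry in paths:
        if isinstance(entry, bytes):
            # Already an 8-bit string, nothing to encode.
            continue
        try:
            entry.encode(encoding)
        except UnicodeEncodeError:
            bad.append(entry)
    return bad

# Fall back to the default encoding if the platform reports none.
fs_encoding = sys.getfilesystemencoding() or sys.getdefaultencoding()
problem_entries = unencodable_entries(sys.path, fs_encoding)
```

An empty problem_entries list would suggest the encoding itself is not the obstacle and that the fault lies elsewhere.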
Modules/getpath.c and friends don't encode the results returned by the platform
APIs, so the strings in
Kristján, can you provide more details on the fault you get when trying to
import from the path containing the Chinese characters? Specifically:
What is the actual file system path?
What do sys.prefix, sys.exec_prefix and sys.path contain?
What does sys.getdefaultencoding() return?
What do sys.stdin.encoding, sys.stdout.encoding and sys.stderr.encoding say?
What does "python -v" show?
Does adding the standard lib directories manually to sys.path make any
difference?
Does setting PYTHONHOME to the appropriate settings make any difference?
Running something like the following would be good:
import sys
print "Prefixes:", sys.prefix, sys.exec_prefix
print "Path:", sys.path
print "Default encoding:", sys.getdefaultencoding()
print "Input encoding:", sys.stdin.encoding,
print "Output encodings:", sys.stdout.encoding, sys.stderr.encoding
try:
    import string # Make python -v do something interesting
except ImportError:
    print "Could not find string module"
    sys.path.append(u"stdlib directory name")
    try:
        import string # Make python -v do something interesting
    except ImportError:
        print "Could not find string module"
--
Nick Coghlan | [EMAIL PROTECTED] | Brisbane, Australia
---------------------------------------------------------------
http://www.boredomandlaziness.org