Okay, here are the specifics that demonstrate the problem.
I have a directory, C:\tmp\腌
In it there is a single file, doo.py:
>>> d = os.listdir(u"c:/tmp")[-1]
>>> d
u'\u814c'
>>> d2 = os.listdir(u"c:/tmp/"+d)
>>> d2
[u'doo.py']
>>> p = u"c:/tmp/"+d
>>> p
u'c:/tmp/\u814c'
>>> sys.path.append(p)
>>> import doo
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ImportError: No module named doo

>>> p.encode("mbcs")
'c:/tmp/?'
>>> p.encode("gb2312")
'c:/tmp/\xeb\xe7'
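
The two encode() calls above can be approximated on any platform. The following is only a minimal sketch under stated assumptions: "mbcs" exists only on Windows, so cp1252 is used as a stand-in for a Western ANSI code page (the actual code page of the machine in this session is not stated), and the syntax is for a current Python 3 interpreter rather than the 2.5 build used here. The helper name encode_lossy is made up for illustration.

```python
# Assumption: cp1252 stands in for the Windows-only "mbcs" codec.
path = "c:/tmp/\u814c"


def encode_lossy(text, encoding):
    """Encode text, substituting '?' for characters the codec lacks."""
    return text.encode(encoding, "replace")


western = encode_lossy(path, "cp1252")   # the Chinese character is lost
chinese = encode_lossy(path, "gb2312")   # the Chinese character survives

print(western)
print(chinese)
print(chinese.decode("gb2312") == path)  # does the name round-trip?
```

Once the directory name has been replaced by '?', no byte string in that code page can name the real directory any more, which is consistent with the failed import above.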

Running your example test code gives:
Prefixes: C:\PyDev25 C:\PyDev25
Path: ['c:\\tmp', 'c:\\documents and settings\\kristjan\\my documents\\python',
'C:\\PyDev25\\PCbuild8\\python25.zip', 'C:\\PyDev25\\DLLs', 'C:\\PyDev25\\lib',
'C:\\PyDev25\\lib\\plat-win', 'C:\\PyDev25\\lib\\lib-tk',
'C:\\PyDev25\\PCbuild8', 'C:\\PyDev25', 'C:\\PyDev25\\lib\\site-packages']
Default encoding: ascii
Input encoding: cp850 Output encodings: cp850 cp850

-----Original Message-----
From: Nick Coghlan [mailto:[EMAIL PROTECTED] 
Sent: 17 June 2006 04:17
To: Phillip J. Eby
Cc: Kristján V. Jónsson; Python Dev
Subject: Re: [Python-Dev] unicode imports

Phillip J. Eby wrote:
> Actually, you would want to put it in sys.path_hooks, and then 
> instances would be placed in path_importer_cache automatically.  If 
> you are adding it to the path_hooks after the fact, you should simply 
> clear the path_importer_cache.  Simply poking stuff into the 
> path_importer_cache is not a recommended approach.

Oh, I agree - poking it in directly was a desperation measure if the path_hooks 
machinery didn't like Unicode either.
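
The approach Phillip describes (register a hook in sys.path_hooks, then clear sys.path_importer_cache) could look roughly like the sketch below. It is an illustration only, written for a current Python 3 interpreter rather than 2.5; recording_hook, import_with_hook and the module name doo_hook_demo are hypothetical names, and the hook merely records each path entry before declining it, so the standard hooks still do the real work.

```python
import importlib
import os
import sys
import tempfile

seen_entries = []  # hypothetical log: every path entry offered to the hook


def recording_hook(path_entry):
    """Record the entry, then decline it so the normal hooks still run."""
    seen_entries.append(path_entry)
    raise ImportError("recording_hook declines %r" % (path_entry,))


def import_with_hook(module_name, source):
    """Import a throwaway module from a fresh directory with the hook active."""
    with tempfile.TemporaryDirectory() as tmp:
        with open(os.path.join(tmp, module_name + ".py"), "w") as f:
            f.write(source)
        sys.path_hooks.insert(0, recording_hook)
        # Forget importers that were chosen before the hook was registered.
        sys.path_importer_cache.clear()
        sys.path.append(tmp)
        try:
            module = importlib.import_module(module_name)
            return module.VALUE, tmp in seen_entries
        finally:
            # Undo every change so the interpreter is left as it was found.
            sys.path.remove(tmp)
            sys.path_hooks.remove(recording_hook)
            sys.path_importer_cache.clear()
            sys.modules.pop(module_name, None)


value, hook_was_consulted = import_with_hook("doo_hook_demo", "VALUE = 42\n")
print(value, hook_was_consulted)
```

Clearing the cache is the step that matters when a hook is added after the fact: without it, path entries that already have a cached importer would never be offered to the new hook.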

I've since gone and looked, and you may be screwed either way - the standard 
import paths appear to be always put on the system path as encoded 8-bit 
strings, not as Unicode objects.

That said, it also appears that the existing machinery *should* be able to 
handle non-ASCII path items, so long as 'Py_FileSystemDefaultEncoding' is set 
correctly. If it isn't handling it, then there's something else going wrong.
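
One way to check that claim from the Python side is to ask which path entries cannot be represented in the file system encoding at all. The sketch below assumes a current Python 3 interpreter; sys.getfilesystemencoding() is the Python-level view of that setting, the helper name unencodable_entries is hypothetical, and cp1252 is passed explicitly only to imitate a Western Windows code page.

```python
import sys


def unencodable_entries(entries, encoding=None):
    """Return the text entries that cannot be encoded in `encoding`."""
    if encoding is None:
        encoding = sys.getfilesystemencoding()
    bad = []
    for entry in entries:
        if not isinstance(entry, str):
            continue  # byte-string entries are already encoded
        try:
            entry.encode(encoding)
        except UnicodeEncodeError:
            bad.append(entry)
    return bad


print(sys.getfilesystemencoding())
print(unencodable_entries(["c:/tmp", "c:/tmp/\u814c"], "cp1252"))
```

Any entry this reports has no 8-bit spelling in that encoding, so an import mechanism that works on encoded byte strings has no way to reach it.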

Modules/getpath.c and friends don't encode the results returned by the platform 
APIs, so the strings in

Kristján, can you provide more details on the fault you get when trying to 
import from the path containing the Chinese characters? Specifically:

What is the actual file system path?
What do sys.prefix, sys.exec_prefix and sys.path contain?
What does sys.getdefaultencoding() return?
What do sys.stdin.encoding, sys.stdout.encoding and sys.stderr.encoding say?
What does "python -v" show?
Does adding the standard lib directories manually to sys.path make any 
difference?
Does setting PYTHONHOME to the appropriate settings make any difference?

Running something like the following would be good:

   import sys
   print "Prefixes:", sys.prefix, sys.exec_prefix
   print "Path:", sys.path
   print "Default encoding:", sys.getdefaultencoding()
   print "Input encoding:", sys.stdin.encoding,
   print "Output encodings:", sys.stdout.encoding, sys.stderr.encoding
   try:
       import string # Make python -v do something interesting
   except ImportError:
       print "Could not find string module"
   sys.path.append(u"stdlib directory name")
   try:
       import string # Make python -v do something interesting
   except ImportError:
       print "Could not find string module"

-- 
Nick Coghlan   |   [EMAIL PROTECTED]   |   Brisbane, Australia
---------------------------------------------------------------
             http://www.boredomandlaziness.org