Kristján Valur Jónsson <krist...@ccpgames.com> added the comment:

> Yes, but in Python, the U+DC80..U+DCFF range is used to store undecodable bytes. 
> E.g. 'abc\xff'.decode('ascii', 'surrogateescape') gives 'abc\udcff'.

That's an inventive way of breaking the unicode standard :)
Anyway, why would you worry about that?  My patch doesn't use "surrogateescape" 
so there is no problem.  There are only two places where I "decode":  
1) module names and sys.path components in the system file encoding:  If they 
contain undecodable characters, then that is an error.  No reason to propagate 
that error into the import machinery.
2) when decoding utf-8 back into unicode, but that utf-8 is already legal 
since _we_ generated it.
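
A minimal sketch of case 1, written for Python 3 and not taken from the patch: 
byte path components are decoded strictly, and anything undecodable is set 
aside rather than passed on to the import machinery.  The helper name 
decode_path_entries and its encoding parameter are hypothetical.

```python
import sys


def decode_path_entries(entries, encoding=None):
    """Strictly decode byte path entries; collect the undecodable ones.

    Hypothetical helper for illustration only.
    """
    if encoding is None:
        encoding = sys.getfilesystemencoding()
    decoded, rejected = [], []
    for entry in entries:
        if isinstance(entry, bytes):
            try:
                # Default error handler is 'strict': no surrogateescape.
                entry = entry.decode(encoding)
            except UnicodeDecodeError:
                # Undecodable component: treat as an error, do not propagate.
                rejected.append(entry)
                continue
        decoded.append(entry)
    return decoded, rejected


good, bad = decode_path_entries([b"caf\xc3\xa9", "plain", b"bad\xff"], "utf-8")
print(good, bad)
```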

If a _unicode_ input (sys.path) contains a valid surrogate pair, then the utf-8 
encoder just encodes it.
But if it finds a lone surrogate as you describe (a Python special), then that 
represents an undecodable character, something that should have been covered 
earlier and something we know nothing about.  Clearly, that makes that 
particular unicode sys.path component invalid.
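
The distinction can be checked with a short Python 3 sketch (an illustration 
under Python 3 semantics, not the 2.x behaviour discussed here): a character 
outside the BMP, which UTF-16 stores as a surrogate pair, encodes to utf-8 
without trouble, while the lone surrogate produced by 'surrogateescape' is 
rejected by the strict utf-8 encoder.

```python
# Undecodable byte 0xff is smuggled into the string as lone surrogate U+DCFF.
raw = b"abc\xff"
escaped = raw.decode("ascii", "surrogateescape")

# The same error handler restores the original bytes on the way back.
round_trip = escaped.encode("ascii", "surrogateescape")

# U+1F600 lies outside the BMP; UTF-16 represents it as a surrogate pair.
non_bmp = "\U0001F600"
non_bmp_utf8 = non_bmp.encode("utf-8")

# A strict utf-8 encode of the lone surrogate fails in Python 3.
try:
    escaped.encode("utf-8")
    lone_surrogate_rejected = False
except UnicodeEncodeError:
    lone_surrogate_rejected = True

print(repr(escaped), round_trip, non_bmp_utf8, lone_surrogate_rejected)
```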

(Hm, I notice that 2.7 happily encodes lone surrogates to utf-8)

> Python 2.7 is out and I think it is too late to fix Python2. Anyway, Python2 
> uses bytes for sys.path or other paths, so the problem only occurs if the 
> user specifies unicode paths.

Which is precisely the case that my patch is designed to solve.  When a Chinese 
user installs EVE Online in a weird folder, that should work.
Also, 2.x is not quite dead yet.  There are quite a few people doing their own 
patches for their private purposes.  Although my patch won't go into any 
official version, there might be others in the same situation as us:  trying 
to support an _embedded_ Python 2.x version in an internationalized 
environment (on Windows :)

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue1552880>
_______________________________________
_______________________________________________
Python-bugs-list mailing list