Re: [Python-Dev] transform() and untransform() methods, and the codec registry

Victor Stinner Sun, 05 Dec 2010 14:27:30 -0800

On Saturday 04 December 2010 09:31:04 you wrote:
> Alexander Belopolsky writes:
>  > In fact, once the language moratorium is over, I will argue that
>  > str.encode() and byte.decode() should deprecate encoding argument and
>  > just do UTF-8 encoding/decoding.  Hopefully by that time most people
>  > will forget that other encodings exist.  (I can dream, right?)
> 
> It's just a dream.  There's a pile of archival material, often on R/O
> media, out there that won't be transcoded any more quickly than the
> inscriptions on Tutankhamun's tomb.


Not only that: many libraries expect bytes arguments encoded in a specific 
encoding (e.g. the locale encoding). Said differently, only a few libraries 
written in C accept wchar_t* strings.

The Linux kernel (like many, or perhaps all, UNIX/BSD kernels) only 
manipulates byte strings. The libc only accepts wide characters for a few 
operations. I don't know how to open a file with a Unicode path using the 
Linux libc: you have to encode it...
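
To illustrate the "you have to encode it" step, here is a minimal sketch in 
current Python 3. It assumes os.fsencode() and os.fsdecode() are available 
(they were added in Python 3.2) and that the filesystem encoding can represent 
the file name, which is the case with UTF-8. The helper name to_kernel_path 
is hypothetical, chosen only for this example.

```python
import os
import tempfile


def to_kernel_path(path: str) -> bytes:
    """Encode a str path to the bytes that are handed to the OS.

    Hypothetical helper: it only wraps os.fsencode(), which uses the
    filesystem encoding (the locale encoding on most UNIX systems).
    """
    return os.fsencode(path)


with tempfile.TemporaryDirectory() as tmp:
    # A file name containing a non-ASCII character.
    name = os.path.join(tmp, "caf\u00e9.txt")

    # The path is passed to the OS as bytes, not as Unicode.
    raw = to_kernel_path(name)

    # os.open() accepts the encoded bytes path directly.
    fd = os.open(raw, os.O_WRONLY | os.O_CREAT, 0o600)
    try:
        os.write(fd, b"hello")
    finally:
        os.close(fd)

    # Opening with the str path makes Python do the same encoding
    # implicitly, so both names refer to the same file.
    with open(name, "rb") as f:
        data = f.read()

    # Decoding the bytes path gives back the original str path.
    round_trip = os.fsdecode(raw)
```

Passing a str to open() hides the encoding step, but the bytes are what the 
OS actually receives on UNIX.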

Alexander: you should first patch all UNIX/BSD kernels to use Unicode 
everywhere, then patch all libc implementations, and then all libraries 
(written in C). After that, you can take a break.

Victor
