Julian 'Julik' Tarkhanov wrote:
> Python's unicode is actually UTF-16 


sorry, but no. it's not utf-16.

it's decided at compile-time,
and it's either utf-32 or utf-16.

on linux it's usually utf-32, and on windows it's usually (always?) utf-16.
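
if you are curious which kind of build you have, here is a small sketch. it assumes that sys.maxunicode is a good enough indicator: 0xFFFF on a narrow (utf-16-style) build, 0x10FFFF on a wide (utf-32-style) build. note that python 3.3 and later always report 0x10FFFF, so the check only tells you something on older interpreters.

```python
import sys

# sys.maxunicode is the largest code point a single unicode character can hold.
# 0xFFFF   -> narrow build (utf-16-style storage)
# 0x10FFFF -> wide build (utf-32-style storage); python 3.3+ always reports this
build = "wide" if sys.maxunicode > 0xFFFF else "narrow"
print(build, hex(sys.maxunicode))
```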

but you should not care about it. you see, in python,
unicode-strings are a separate data-type, and there's
just no way to take a bytestring and tell python: "from now on,
you are a unicode-string, because i know that you are encoded in utf-16."

the way it works is that you take a bytestring
and ask python to convert it into a unicode-string (and you also have 
to tell python the bytestring's charset).
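
a minimal sketch of that conversion, written for python 3, where the bytestring type is called bytes and the unicode-string type is called str. the sample data and the variable names are made up for illustration.

```python
# a bytestring as it might arrive from a file or a socket (utf-8 encoded here)
raw = b"caf\xc3\xa9"

# decoding: bytestring + charset name -> unicode-string
text = raw.decode("utf-8")

# encoding goes the other way, and the target charset is your choice,
# independent of whatever python uses internally
as_utf16 = text.encode("utf-16-le")

# 5 bytes of utf-8, 4 characters, 8 bytes of utf-16
print(len(raw), len(text), len(as_utf16))
```

the point is that the charset is always named explicitly at the boundary, so the internal representation never shows up in your code.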

so while converting from utf-16-bytestrings to unicode might sometimes 
be faster than converting from utf-8-bytestrings to unicode, you can't 
be sure, because as i wrote above, the internal unicode-encoding is 
not fixed.

> whereas IO and the databases mostly speak UTF-8 -
> so no, you can't dump it over the wire.

> We Rubyists are a tad happier because we now
> have all in UTF-8

you mean that regexes, and all the methods of the string-class now are 
unicode-aware in ruby? :)

gabor
