On Jul 15, 2005, at 5:06 PM, Brian Kirsch wrote:

Hi Andrea,
Thanks for the feedback. Comments are included inline:


I18n G.A.L. wrote:

Brian, et al,
I'm not able to do a whole lot online, but I've read over the plan. Here are my comments: 1. In general, it's best to specify the character encoding scheme (or form) directly. "Unicode" can mean UTF-8, UTF-16 (BE or LE), or UTF-32. I recommend using UTF-8 wherever possible, unless working within a 16-bit oriented environment (such as Java), where I'd recommend UTF-16.
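To illustrate the point about naming the encoding form explicitly, here is a minimal sketch, assuming Python 3 and an arbitrary sample string chosen only for illustration. The same text yields different byte sequences under each form, so "Unicode" alone does not determine what ends up on disk or on the wire.

```python
# Sample text is an illustrative assumption: "cafe" with an acute accent on the e.
text = "caf\u00e9"

# Name the encoding form (and byte order) explicitly rather than saying "Unicode".
encodings = ["utf-8", "utf-16-le", "utf-16-be", "utf-32-le"]
encoded = {name: text.encode(name) for name in encodings}

for name, data in encoded.items():
    # Print the byte length and the raw bytes as hex for comparison.
    print(name, len(data), data.hex())
```

The explicit "-le" and "-be" codec names write no byte order mark; the plain "utf-16" codec would prepend one.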


When I write "unicode" in lower case, I mean Python's unicode object, which can be UTF-16 (BE or LE) or UTF-32 depending on the platform it was compiled on.

It's actually UCS-2 or UCS-4, which, at least for the time being, makes a 100% functional ICU wrapper impossible to build on the built-in Python unicode object (see the python-dev archives from a few months back). Also, I would recommend using UTF-16 as your standard encoding if at all possible: it avoids lots of nasty encoding problems and is a nice space compromise for almost any language whose characters aren't a subset of Latin-1.
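A rough way to check both points, sketched under the assumption of Python 3 syntax and with sample strings picked purely for illustration: sys.maxunicode tells you which kind of build you are running, and encoding a few short strings shows how the space trade-off plays out per script.

```python
import sys

# 0xFFFF on a narrow (UCS-2) build, 0x10FFFF on a wide (UCS-4) build.
# Python 3.3 and later always report 0x10FFFF.
is_wide = sys.maxunicode > 0xFFFF

# Illustrative samples (assumptions, not from the plan under discussion).
samples = {
    "latin": "hello",
    "greek": "\u03b1\u03b2\u03b3",
    "cjk": "\u65e5\u672c\u8a9e",
    "supplementary": "\U0001d11e",  # outside the BMP, needs a surrogate pair in UTF-16
}

def encoded_sizes(text):
    """Return the byte length of text in each encoding form."""
    return {enc: len(text.encode(enc)) for enc in ("utf-8", "utf-16-le", "utf-32-le")}

sizes = {label: encoded_sizes(text) for label, text in samples.items()}

print("wide build:", is_wide)
for label, result in sizes.items():
    print(label, result)
```

In this sketch UTF-16 is smaller than UTF-8 for the CJK sample, equal for the Greek one, and larger for plain ASCII, so the "space compromise" depends on the text being stored.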

--
Nick

_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

Open Source Applications Foundation "Dev" mailing list
http://lists.osafoundation.org/mailman/listinfo/dev
