On Fri, Sep 21, 2001 at 04:52:51PM -0700, Paul Prescod wrote:
> > Urgh, this is tricky. Once you move outside of the BMP, the encodings you
> > *really* want to work stop working.
> 
> Don't follow.

UCS-2 is only defined for characters inside the Basic Multilingual Plane;
UTF-16 has to use surrogates for non-BMP characters, and that sucks too
because what used to be a nice fixed-width encoding has suddenly gone
variable-width on you. You didn't want that to happen. UTF-8 is variable-width
too (four bytes per non-BMP character), which is screaming difficult to process.
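
To make the width problem concrete, here is a rough Python sketch, not
anything from a real codebase. The helper names (utf16_units,
surrogate_pair) and the two sample characters are my own picks for
illustration. It shows a BMP character fitting in one 16-bit unit while a
non-BMP character needs a surrogate pair in UTF-16 and four bytes in UTF-8.

```python
# Sample characters chosen only for illustration.
bmp_char = "\u3042"          # HIRAGANA LETTER A, inside the BMP
astral_char = "\U0001D11E"   # MUSICAL SYMBOL G CLEF, outside the BMP


def utf16_units(s):
    """Return the UTF-16 code units of s as a list of integers."""
    data = s.encode("utf-16-be")  # big-endian, no byte order mark
    return [int.from_bytes(data[i:i + 2], "big") for i in range(0, len(data), 2)]


def surrogate_pair(cp):
    """Compute the (high, low) UTF-16 surrogates for a code point above U+FFFF."""
    if not 0x10000 <= cp <= 0x10FFFF:
        raise ValueError("code point is not outside the BMP")
    offset = cp - 0x10000
    return 0xD800 + (offset >> 10), 0xDC00 + (offset & 0x3FF)


for ch in (bmp_char, astral_char):
    units = utf16_units(ch)
    utf8_len = len(ch.encode("utf-8"))
    print("U+%04X: %d UTF-16 unit(s), %d UTF-8 byte(s)" % (ord(ch), len(units), utf8_len))
```

Anything that indexes a string by 16-bit unit and assumes one unit per
character quietly breaks on the second case.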

> > Why will they bother screaming loud enough? Unicode doesn't do what they want
> > and JIS/SJIS/EUC/whatever does. 
> 
> But where do they get their software? 

Oh, I forget there are non-Unix platforms. :) I dunno what things like
Ichitaro use for a file format.

> other than Java internally with their recent APIs. So I'd like to know
> more about whether Japanese and Chinese people are really using
> something other than Unicode or whether they are just using variant
> encodings for data that their software treats internally as Unicode.

I have a very strong suspicion it depends on the nationality of the
programmer. :) (And we're supposed to be generating programming languages
for programmers...)

-- 
Resist the urge to start typing; thinking is a worthwhile alternative.
    -- Kernighan and Pike
