Re: UTF-8 wakeup call

Keld Jørn Simonsen Sat, 07 Dec 2002 06:32:08 -0800

On Fri, Dec 06, 2002 at 05:10:34PM +0100, Kent Karlsson wrote:
> 
> > Then Unicode was
> > reluctantly persuaded to do 31-bit and later they were persuaded
> 
> This is a tainted description.  However, WG2 (10646) was "persuaded"
> to include UTF-16, so it goes both ways.


True, but Linux UTF-8 support does not build on UTF-16.

> > also to use the UTF-8. Very recently Unicode introduced UTF-32,
> > which is reflecting what has been in use all the time.
> > The way 10646 is coming to Linux is also much
> > with the support from the ISO 14651 sorting standard 
> 
> 14651 is good, but has some flaws.  See Unicode standard annex 10
> (http://www.unicode.org/reports/tr10/) that avoids SOME (not all)
> of those flaws.

Maybe there are flaws in 14651, but it is ISO 14651 which is used in Linux.

> > and the ISO TR 14652 locale standard. 
> 
> 14652 is NOT a standard.  It is also very unlikely to ever develop into one.
> Keld, please stop promoting it as a standard, when you very well know
> that it is NOT a standard.

It is as much a standard as Unicode in the generic sense of the word
"standard", but it is not an ISO standard. Please understand that.

> > I think the proper way to characterize what we do now in Linux is
> > to say ISO 10646, and probably mention Unicode in parenthesis the first
> > time it appears. It should not be that difficult, we have been
> > referring to ISO 8859 for a long time. So please use ISO 10646
> > instead of the name Unicode when you refer to this in articles etc.
> 
> Unicode, however, provides a lot of data, algorithms, and hints that are
> not provided (adequately) by any ISO standard.  It therefore makes sense
> to refer to Unicode, and to use the Unicode character database data,
> http://www.unicode.org/ucd/, mapping tables (http://www.unicode.org/Public/MAPPINGS/)
> as well as algorithms specified by Unicode (such as the normalisation algorithm,
> http://www.unicode.org/reports/tr15/ and the BiDi algorithm,
> http://www.unicode.org/reports/tr9/).

The mappings used also come, at least in part, from RFC 1345 (recode uses
that) or from IS 15897, which uses many of the same names and mappings.
Specifically, I have seen that Linux is *not* using the Unicode data
because of copyright issues. It would be much fairer, in terms of the
specifications and data used, to say that Linux follows ISO standards
and specifications, and to promote it as a system implementing ISO 10646.

Kind regards
keld
--
Linux-UTF8:   i18n of Linux on all levels
Archive:      http://mail.nl.linux.org/linux-utf8/
