JFYI:

-----Original Message-----
From: Yung-Fong Tang <[EMAIL PROTECTED]>
To: [EMAIL PROTECTED] <[EMAIL PROTECTED]>; [EMAIL PROTECTED] <[EMAIL PROTECTED]>
Date: 25 May 2001 21:53
Subject: Re: string apis and UAX #15: Unicode Normalization

BTW, ICU from IBM
http://oss.software.ibm.com/developerworks/opensource/icu/project/
already has an open source implementation of the normalization code.
IBM has already submitted some ICU code to us for the Unicode Bidi
algorithm and Arabic Shaping. I believe mkaply (the Mozilla OS/2 Sir)
can help us get IBM to submit the code to Mozilla if you are willing
to take it.

mkaply - do you know where the normalization code is in ICU?

Yung-Fong Tang wrote:

> Dear scc:
>
> Any plan to add Unicode normalization forms
> ( http://www.unicode.org/unicode/reports/tr15/ )
> into your not-yet-frozen string API?
> Unicode normalization will be very important for new web
> protocols/formats/specifications, for example:
>
>   * XPath: see http://www.w3.org/TR/xpath 3.6 Strings
>   * Character Model for the World Wide Web 1.0
>     http://www.w3.org/TR/charmod/
>   * Document Object Model (Core) Level 1
>     http://www.w3.org/TR/REC-DOM-Level-1/level-one-core.html
>     1.1.6. Case sensitivity in the DOM
>   * Internationalized Resource Identifiers (IRI)
>     http://www.ietf.org/internet-drafts/draft-masinter-url-i18n-07.txt
>     see 2.3 Mapping of IRIs to URIs
>
> Frank Tang
>
> [Unicode] Technical Reports
> Unicode Standard Annex #15
> Unicode Normalization Forms
>
> Version: Unicode 3.1.0
> Authors: Mark Davis ([EMAIL PROTECTED]), Martin Duerst ([EMAIL PROTECTED])
> Date: 2001-03-23
> This Version: http://www.unicode.org/unicode/reports/tr15/tr15-21.html
> Previous Version: http://www.unicode.org/unicode/reports/tr15/tr15-19.html
> Latest Version: http://www.unicode.org/unicode/reports/tr15
> Tracking Number: 20
>
> Summary
>
> This document describes specifications for four normalized forms of
> Unicode text. With these forms, equivalent text (canonical or
> compatibility) will have identical binary representations. When
> implementations keep strings in a normalized form, they can be assured
> that equivalent strings have a unique binary representation.
>
>     Note: Unicode 3.1 introduces a change that may affect
>     backwards compatibility for some implementations; for
>     details see Annex 12: Unicode 3.1 Normalization Corrigendum.

-
Linux-UTF8:   i18n of Linux on all levels
Archive:      http://mail.nl.linux.org/linux-utf8/
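
To make the point of the quoted summary concrete (equivalent text gets an identical binary representation once it is normalized), here is a minimal sketch in Python. It is only an illustration under stated assumptions: it uses Python's standard unicodedata module, not ICU or the Mozilla string API discussed in this thread; the sample strings and the helper normalized_bytes are made up for the example; and the Unicode data version is whatever ships with the Python build, not necessarily 3.1. The four forms defined by UAX #15 are NFC, NFD, NFKC and NFKD.

```python
import unicodedata

# Two canonically equivalent spellings of "e with acute" (illustrative data).
precomposed = "\u00e9"   # LATIN SMALL LETTER E WITH ACUTE
decomposed = "e\u0301"   # "e" followed by COMBINING ACUTE ACCENT

# Without normalization the two strings compare as different.
assert precomposed != decomposed

# The four normalization forms defined by UAX #15.
FORMS = ("NFC", "NFD", "NFKC", "NFKD")


def normalized_bytes(text, form):
    """Hypothetical helper: normalize text, then encode it as UTF-8."""
    return unicodedata.normalize(form, text).encode("utf-8")


# After normalizing both strings to the same form, the bytes are identical.
for form in FORMS:
    assert normalized_bytes(precomposed, form) == normalized_bytes(decomposed, form)

# Compatibility equivalence is only folded by the "K" forms:
# U+FB01 (LATIN SMALL LIGATURE FI) is compatibility-equivalent to "fi".
ligature = "\ufb01"
assert unicodedata.normalize("NFC", ligature) != "fi"
assert unicodedata.normalize("NFKC", ligature) == "fi"
```

The practical consequence for a string API is that comparison, hashing and lookup only behave consistently for equivalent text if both sides are brought to the same form first; which form to pick (canonical or compatibility) depends on the protocol in question.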
