JFYI:

-----Original Message-----
From: Yung-Fong Tang <[EMAIL PROTECTED]>
To: [EMAIL PROTECTED] <[EMAIL PROTECTED]>;
[EMAIL PROTECTED] <[EMAIL PROTECTED]>
Date: 25 May 2001  21:53
Subject: Re: string apis and UAX #15: Unicode Normalization



BTW, ICU from IBM
http://oss.software.ibm.com/developerworks/opensource/icu/project/
already has an open source implementation of the normalization code.

IBM has already submitted some ICU code to us for the Unicode Bidi algorithm
and Arabic shaping. I believe mkaply (the Mozilla OS/2 Sir) can help us get
IBM to submit the code to Mozilla if you are willing to take it.

mkaply: do you know where the normalization code is in ICU?

Yung-Fong Tang wrote:

> Dear scc:
>
> Any plans to add Unicode normalization forms (
> http://www.unicode.org/unicode/reports/tr15/
> ) to your not-yet-frozen string API?
> Unicode normalization will be very important for new web
> protocols, formats, and specifications, for example:
>
>    * XPath: see http://www.w3.org/TR/xpath 3.6 Strings
>    * Character Model for the World Wide Web 1.0
>      http://www.w3.org/TR/charmod/
>    * Document Object Model (Core) Level 1
>      http://www.w3.org/TR/REC-DOM-Level-1/level-one-core.html 1.1.6.
>      Case sensitivity in the DOM
>    * Internationalized Resource Identifiers (IRI)
>      http://www.ietf.org/internet-drafts/draft-masinter-url-i18n-07.txt
>      see 2.3 Mapping of IRIs to URIs
>
> Frank Tang
>
> [Unicode] Technical Reports
> Unicode Standard Annex #15
> Unicode Normalization Forms
>
> Version:          Unicode 3.1.0
> Authors:          Mark Davis ([EMAIL PROTECTED]), Martin Duerst ([EMAIL PROTECTED])
> Date:             2001-03-23
> This Version:     http://www.unicode.org/unicode/reports/tr15/tr15-21.html
> Previous Version: http://www.unicode.org/unicode/reports/tr15/tr15-19.html
> Latest Version:   http://www.unicode.org/unicode/reports/tr15
> Tracking Number:  20
>
> Summary
>
> This document describes specifications for four normalized forms of
> Unicode text. With these forms, equivalent text (canonical or
> compatibility) will have identical binary representations. When
> implementations keep strings in a normalized form, they can be assured
> that equivalent strings have a unique binary representation.
>
>      Note: Unicode 3.1 introduces a change that may affect
>      backwards compatibility for some implementations; for
>      details see Annex 12: Unicode 3.1 Normalization Corrigendum.
>
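
As a rough illustration of the summary's claim that equivalent strings end up
with identical representations once normalized, here is a minimal Python
sketch. It uses the standard library's unicodedata.normalize rather than ICU
or any Mozilla string API; the helper name same_after and the sample strings
are made up for this example.

```python
import unicodedata

# Two canonically equivalent spellings of "e with acute accent".
composed = "\u00e9"      # single precomposed code point
decomposed = "e\u0301"   # "e" followed by COMBINING ACUTE ACCENT

# A compatibility character: LATIN SMALL LIGATURE FI.
ligature = "\ufb01"


def same_after(form, a, b):
    """Return True if a and b are identical once normalized to `form`.

    `form` is one of the four forms from UAX #15: "NFC", "NFD",
    "NFKC" or "NFKD". (Hypothetical helper, not part of any library.)
    """
    return unicodedata.normalize(form, a) == unicodedata.normalize(form, b)


if __name__ == "__main__":
    # Without normalization the two spellings compare unequal.
    print(composed == decomposed)
    # Canonical forms make canonically equivalent text compare equal.
    print(same_after("NFC", composed, decomposed))
    print(same_after("NFD", composed, decomposed))
    # Only the compatibility forms fold the ligature into plain "fi".
    print(same_after("NFC", ligature, "fi"))
    print(same_after("NFKC", ligature, "fi"))
```

A string API that stored text in one chosen form could then compare strings
by plain binary equality, which is the property the summary describes.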

-
Linux-UTF8:   i18n of Linux on all levels
Archive:      http://mail.nl.linux.org/linux-utf8/
