Rick McGowan scripsit:
> The Unicode Technical Committee has posted a new issue for public
> review and comment. Details are on the following web page:
> 
>       http://www.unicode.org/review/

I have prepared a draft DiacriticFolding.txt file for this issue; it is
temporarily available at http://www.ccil.org/~cowan/DiacriticFolding.txt .
This was prepared by looking for lines in UnicodeData.txt that matched
the regex '(GREEK|LATIN|CYRILLIC|HEBREW).*WITH'.  (I added Hebrew to the
set of scripts specified by the current draft of UTR #30.)
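
A minimal sketch of that selection step, for anyone who wants to reproduce
it. It assumes Python's bundled unicodedata module as a stand-in for the
UnicodeData file, so the matches reflect whatever Unicode version the
interpreter ships, not the version the draft was built from. The helper
name candidates() is hypothetical.

```python
import re
import sys
import unicodedata

# The regex quoted in the note, applied to character names.
PATTERN = re.compile(r'(GREEK|LATIN|CYRILLIC|HEBREW).*WITH')

def candidates(limit=sys.maxunicode + 1):
    """Yield (code point, name) for every named character below `limit`
    whose name contains a match for PATTERN."""
    for cp in range(limit):
        # Unnamed code points get the empty default and never match.
        name = unicodedata.name(chr(cp), '')
        if PATTERN.search(name):
            yield cp, name

if __name__ == '__main__':
    for cp, name in candidates():
        print(f'{cp:04X};{name}')
```

The search is unanchored, as with grep, so the script name may appear
anywhere before WITH in the character name.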

Characters with decompositions were mapped into the base character of the
decomposition; characters without decompositions were mapped by name.
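
The two mapping rules can be sketched roughly as follows. This is an
illustration under several assumptions, not the procedure used to build
the file: canonical decompositions are followed repeatedly until a
character without one is reached, compatibility decompositions (those
with a <tag>) are treated as "no decomposition", and "mapped by name" is
taken to mean dropping everything from " WITH " onward and looking up the
remaining name. The function names are hypothetical.

```python
import unicodedata

def fold_by_decomposition(ch):
    """Return the base character reached by following canonical
    decompositions from ch, or None if ch has no canonical decomposition."""
    current = ch
    while True:
        decomp = unicodedata.decomposition(current)
        # Stop on no decomposition, or on a tagged (compatibility) one.
        if not decomp or decomp.startswith('<'):
            break
        # The first field is the base; the rest are combining marks.
        current = chr(int(decomp.split()[0], 16))
    return current if current != ch else None

def fold_by_name(ch):
    """Return the character named by the part of ch's name before ' WITH ',
    or None if ch's name has no ' WITH ' or no such character exists."""
    name = unicodedata.name(ch, '')
    if ' WITH ' not in name:
        return None
    try:
        return unicodedata.lookup(name.split(' WITH ')[0])
    except KeyError:
        return None

def fold(ch):
    """Prefer the decomposition rule; fall back to the name rule."""
    return fold_by_decomposition(ch) or fold_by_name(ch)
```

For example, fold('\u00e9') takes the decomposition route, while
fold('\u00f8') (LATIN SMALL LETTER O WITH STROKE, which has no
decomposition) falls through to the name rule.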
The file http://www.ccil.org/~cowan/DiacriticFoldingExceptions.txt contains
a list of 32 characters matching the pattern which did not seem to me
to be suitable for diacritic folding.

I have posted a short version of this note to the Unicode comment form.

Comments?

-- 
A rabbi whose congregation doesn't want         John Cowan
to drive him out of town isn't a rabbi,         http://www.ccil.org/~cowan
and a rabbi who lets them do it                 [EMAIL PROTECTED]
isn't a man.    --Jewish saying                 http://www.reutershealth.com
