> Hi,
>
> I'm trying to make a UCM file to feed to enc2xs. The legacy encoding for
> Taiwanese romanization *must* have its code points mapped to Unicode
> character sequences, for the simple reason that the UCS lacks the
> corresponding precomposed characters (and is unlikely to have them in the
> future, as they are composable using existing characters from the Latin
> script and the Diacritical Combining Marks blocks). (See [1] for script
> details.)

(snip)

> How does enc2xs deal with (or intend to deal with) such a case? Is the ICU
> specification to be followed rigidly?
>
> Since I am very new to Perl, any insight is appreciated.
>
> [1] http://lomaji.com/poj/chart.html
> [2] http://oss.software.ibm.com/icu/userguide/conversion-data.html
>
> --Henry H. Tan-Tenn

Anyway, your chart does not give code points for "the legacy encoding". Is the mapping table you intend to use available somewhere?

(I also wonder why "the legacy encoding" is required at all, given that the UTF encodings for Unicode are available.)

SADAHIRO Tomoyuki