> Contact: [EMAIL PROTECTED]
> Report Type: Other Question, Problem, or Feedback
>
> My problem is to recognize, from the 32-bit value of a Unicode
> character, whether it is a Chinese, Korean, or Japanese character. How can I do this?

You can tell that a character is NOT from a legacy character set such as Shift_JIS or Big5 if converting it to that character set fails. Alternatively, you can look it up in Unihan.txt <http://www.unicode.org/Public/UNIDATA/Unihan.txt> (25 megabytes, also available at the FTP site). There are also Perl routines for getting at the information.
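
The conversion test can be sketched in a few lines of Python. This is only an illustration: the function name legacy_charsets and the particular list of codecs are my own assumptions, and the codec names are those of Python's standard library, not anything specific to Unicode tooling.

```python
# Legacy encodings to try. This list is an assumption made for illustration;
# the names are the standard-library codec names in Python.
LEGACY_CODECS = ("shift_jis", "big5", "euc_kr", "gb2312")


def legacy_charsets(code_point):
    """Return the legacy codecs that are able to encode the given code point."""
    ch = chr(code_point)
    supported = []
    for codec in LEGACY_CODECS:
        try:
            ch.encode(codec)
        except UnicodeEncodeError:
            # Conversion failed: the character is NOT in this character set.
            continue
        supported.append(codec)
    return supported


if __name__ == "__main__":
    for cp in (0x4E01, 0x3042, 0xD55C):
        print(f"U+{cp:04X}", legacy_charsets(cp))
```

Note that only a failed conversion is conclusive. A successful one shows that the character exists in that character set, not that the text is in that language, since many ideographs are shared by the Chinese, Japanese, and Korean sets.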


A sample entry from Unihan.txt:

U+4E01 kAlternateKangXi 0075.003
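
Entries of this shape (code point, field name, value, separated by whitespace) could be read with a short Python sketch like the following. The helper names parse_unihan_line and fields_for are hypothetical, and the sketch assumes that lines starting with "#" are comments.

```python
def parse_unihan_line(line):
    """Parse one entry into (code_point, field, value); return None otherwise."""
    line = line.strip()
    if not line or line.startswith("#"):
        return None  # blank line or comment (assumed comment marker)
    parts = line.split(None, 2)  # split on whitespace, at most three pieces
    if len(parts) != 3 or not parts[0].startswith("U+"):
        return None
    return int(parts[0][2:], 16), parts[1], parts[2]


def fields_for(lines, code_point):
    """Collect {field: value} for one code point from an iterable of lines."""
    result = {}
    for line in lines:
        entry = parse_unihan_line(line)
        if entry is not None and entry[0] == code_point:
            result[entry[1]] = entry[2]
    return result


if __name__ == "__main__":
    sample = ["# sample data", "U+4E01 kAlternateKangXi 0075.003"]
    print(fields_for(sample, 0x4E01))
```

To scan the real file, an open file object can be passed as lines, since the function only needs to iterate over it line by line.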

JD


