Re: FW: Web Form: Other Question: CJK

John Delacour Sat, 11 Oct 2003 10:35:20 -0700

> Contact: [EMAIL PROTECTED]
> Report Type: Other Question, Problem, or Feedback
>
> My problem is to recognize from the 32 bit value of unicode
> character if this is a chinese character or korean or japanese.
> How can do this?

You can tell that a character is NOT from a given legacy character set, such as Shift_JIS or Big5, by trying to convert it to that character set and seeing the conversion fail. Or you can look it up in Unihan.txt <http://www.unicode.org/Public/UNIDATA/Unihan.txt> (25 megabytes, also at the ftp site). There are also Perl routines for getting at the information.
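As a rough illustration of the conversion test, here is a minimal Python sketch. The function name legacy_charsets and the choice of candidate encodings are my own assumptions, not anything from the original question; the codec names are the ones Python's standard library ships. A successful conversion only shows that the character exists in that legacy set. It does not prove the character "is" Japanese or Chinese, since many Han characters are present in several sets at once.

```python
def legacy_charsets(code_point,
                    candidates=("shift_jis", "big5", "euc_kr", "gb2312")):
    """Return the candidate legacy encodings that can represent code_point.

    The helper name and the default candidate list are illustrative choices.
    """
    ch = chr(code_point)
    found = []
    for name in candidates:
        try:
            ch.encode(name)          # raises if the charset lacks this character
        except UnicodeEncodeError:
            continue                 # not in this legacy set
        found.append(name)
    return found


if __name__ == "__main__":
    for cp in (0x4E01, 0xD55C, 0x0E01):
        print("U+%04X" % cp, legacy_charsets(cp))
```

An empty result means the character is in none of the sets tried; a result with several names means the conversion test alone cannot narrow it down further.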

U+4E01 kAlternateKangXi 0075.003
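The entry above has three parts: the code point, a field name, and a value. A small Python sketch for pulling those apart might look like the following. The function name parse_unihan_line is hypothetical, and I am assuming the fields are separated by whitespace (tabs in the actual file) and that comment lines start with "#".

```python
def parse_unihan_line(line):
    """Split one Unihan-style entry into (code_point, field, value).

    Returns None for blank lines and '#' comment lines.
    Hypothetical helper; assumes 'U+XXXX <field> <value>' layout.
    """
    line = line.strip()
    if not line or line.startswith("#"):
        return None
    # Split on the first two runs of whitespace only, so a value that
    # itself contains spaces is kept whole.
    code, field, value = line.split(None, 2)
    if not code.startswith("U+"):
        raise ValueError("expected a U+XXXX code point, got %r" % code)
    return int(code[2:], 16), field, value


if __name__ == "__main__":
    print(parse_unihan_line("U+4E01 kAlternateKangXi 0075.003"))
```

Reading the file line by line with a parser like this avoids loading all 25 megabytes into memory at once.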

JD
