On Thursday, February 13, 2003, at 07:18 AM, Marco Cimarosti wrote:

	3) All other characters listed in Unihan.txt are *both*
"Traditional" and "Simplified".

Actually, this is not quite true. The current set of traditional/simplified data is much better than it has ever been, but we still have cases where a new simplified form has been created and encoded while its traditional counterpart has not, and considerably more cases where a traditional form has a theoretical simplification which has not been encoded.

The best you can say is that if a character has a traditional variant (but no simplified variant), it's simplified, and if it has a simplified variant (and no traditional variant), it's traditional, and if it has both, it's both.
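As a rough sketch of that rule in Python: this assumes the variant data is available as tab-separated lines of the form code point, field name, value, and that the relevant fields are kSimplifiedVariant and kTraditionalVariant. The function names and the sample lines are illustrative stand-ins, not excerpts from the actual file, and a character with neither field is reported as "unknown" rather than assumed to be both.

```python
from collections import defaultdict

VARIANT_FIELDS = ("kSimplifiedVariant", "kTraditionalVariant")

def load_variant_fields(lines):
    """Record which of the two variant fields each code point carries."""
    fields = defaultdict(set)
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        parts = line.split("\t")
        if len(parts) < 3:
            continue  # not a "code point, field, value" line
        codepoint, field = parts[0], parts[1]
        if field in VARIANT_FIELDS:
            fields[codepoint].add(field)
    return fields

def classify(codepoint, fields):
    """Apply the rule: the kind of variant a character HAS tells you what it IS."""
    present = fields.get(codepoint, set())
    has_simplified_variant = "kSimplifiedVariant" in present
    has_traditional_variant = "kTraditionalVariant" in present
    if has_simplified_variant and has_traditional_variant:
        return "both"
    if has_traditional_variant:
        return "simplified"   # it points back to a traditional form
    if has_simplified_variant:
        return "traditional"  # it points forward to a simplified form
    return "unknown"          # no variant data: nothing can be concluded

# Illustrative lines only, written in the assumed three-column shape.
SAMPLE_LINES = [
    "U+9F8D\tkSimplifiedVariant\tU+9F99",
    "U+9F99\tkTraditionalVariant\tU+9F8D",
    "U+5E72\tkSimplifiedVariant\tU+5E72",
    "U+5E72\tkTraditionalVariant\tU+4E7E U+5E72 U+5E79",
    "U+4E00\tkDefinition\tone",
]

fields = load_variant_fields(SAMPLE_LINES)
for cp in ("U+9F8D", "U+9F99", "U+5E72", "U+4E00"):
    print(cp, classify(cp, fields))
```

The "unknown" result is deliberate: because some simplifications and some traditional counterparts have not been encoded, the absence of variant data does not show that a character is shared by both.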

	Anyway, I don't see how this information could be of any use for any
purpose...

There are some ideographs (e.g., anything with the bone radical) which have different appearance in simplified and traditional Chinese, even though the two have been unified in Unicode. Identifying a text as simplified vs. traditional could help in automatic font selection.
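One simple way such identification might work, sketched in Python: count the characters that are known to be simplified-only against those known to be traditional-only, and let the majority decide. The function name and the two tiny character sets here are hypothetical; in practice the sets would be derived from the variant data, and mapping the result to an actual font is left out.

```python
def guess_script(text, simplified_only, traditional_only):
    """Guess whether text is simplified or traditional Chinese by majority count."""
    simplified_hits = sum(1 for ch in text if ch in simplified_only)
    traditional_hits = sum(1 for ch in text if ch in traditional_only)
    if simplified_hits > traditional_hits:
        return "simplified"
    if traditional_hits > simplified_hits:
        return "traditional"
    return "undetermined"  # no evidence either way, or a tie

# Hypothetical, very small sets for illustration only.
SIMPLIFIED_ONLY = {"\u9f99", "\u9a6c"}   # simplified forms of "dragon", "horse"
TRADITIONAL_ONLY = {"\u9f8d", "\u99ac"}  # their traditional counterparts

print(guess_script("\u9f99\u9a6c\u7cbe\u795e", SIMPLIFIED_ONLY, TRADITIONAL_ONLY))
print(guess_script("\u9f8d\u99ac\u7cbe\u795e", SIMPLIFIED_ONLY, TRADITIONAL_ONLY))
```

Characters that are the same in both, including unified ones like those with the bone radical, contribute nothing to the count; they only benefit from the font choice once the rest of the text has settled the question.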

==========
John H. Jenkins
[EMAIL PROTECTED]
http://www.tejat.net/

