I've been creating a new set of CNS11643-1992 <-> Unicode mapping tables based on Unihan-3.1.1's kIRG_TSource tag, and have come across a few glitches.
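For anyone wanting to reproduce this kind of check, here is a minimal sketch in Python. It assumes the Unihan file consists of tab-separated lines of the form U+XXXX, kIRG_TSource, plane-hexcode (with the plane written as a hex digit); the function names are illustrative and not taken from any existing tool. The two sample lines are the clashing pair from the typo report in this message. Row/cell numbers are derived by subtracting 0x20 from each byte of the hex code.

```python
from collections import defaultdict


def parse_tsource(lines):
    """Map (plane, code) -> list of code points that claim that CNS position.

    Assumed line format: 'U+XXXX<TAB>kIRG_TSource<TAB>P-HHHH'.
    """
    sources = defaultdict(list)
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) != 3 or fields[1] != "kIRG_TSource":
            continue  # comments, blank lines, and other tags
        plane, code = fields[2].split("-", 1)
        sources[(int(plane, 16), int(code, 16))].append(fields[0])
    return sources


def find_clashes(sources):
    """Return the CNS positions claimed by more than one code point."""
    return {pos: cps for pos, cps in sources.items() if len(cps) > 1}


def describe(plane, code):
    """Format a CNS position as 'plane P at RR/CC (HHHH)'."""
    row = (code >> 8) - 0x20
    cell = (code & 0xFF) - 0x20
    return f"plane {plane} at {row:02d}/{cell:02d} ({code:04X})"


# Sample input: the two entries that clash in Unihan-3.1.1.txt.
sample = [
    "U+28E84\tkIRG_TSource\t6-4627\n",
    "U+2F958\tkIRG_TSource\t6-4627\n",
]

for (plane, code), cps in find_clashes(parse_tsource(sample)).items():
    print(describe(plane, code), "claimed by", ", ".join(cps))
```

Finding the gaps additionally needs the list of positions actually defined in each CNS plane, which this sketch does not include.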
Firstly, there is a typo in Unihan-3.1.1.txt. Compatibility ideograph U+2F958 has its TSource listed as 6-4627, which clashes with U+28E84. The U+2F958 entry is clearly the incorrect one; its TSource should be 6-4267.

Apart from that, it would appear that the CNS<->Unicode mapping specified by Unihan is still not quite complete. It seems to me that a complete round-trip mapping is the intent of Unicode 3.1, although I haven't seen it explicitly stated. I have 19 characters in planes 3-7 that don't show up in the kIRG_TSource mapping:

Gap in plane 3 at 65/72 (6168)
Gap in plane 4 at 02/59 (225B)
Gap in plane 4 at 03/65 (2361)
Gap in plane 4 at 07/74 (276A)
Gap in plane 4 at 08/07 (2827)
Gap in plane 4 at 08/93 (287D)
Gap in plane 4 at 10/78 (2A6E)
Gap in plane 4 at 16/34 (3042)
Gap in plane 4 at 24/60 (385C)
Gap in plane 4 at 35/46 (434E)
Gap in plane 4 at 36/56 (4458)
Gap in plane 4 at 67/25 (6339)
Gap in plane 4 at 69/63 (655F)
Gap in plane 5 at 03/43 (234B)
Gap in plane 5 at 85/76 (756C)
Gap in plane 6 at 10/01 (2A21)
Gap in plane 6 at 60/15 (5C2F)
Gap in plane 7 at 12/26 (2C3A)
Gap in plane 7 at 33/57 (4159)

The other 48,008 ideographs are round-tripped, with the assistance of the CJK Compatibility Ideographs Supplement. Can anyone enlighten me as to the status of the missing 19?

I'm also not clear on the status of plane 15. Is it really part of CNS 11643-1992?

-- 
Kevin Bracey, Principal Software Engineer
Pace Micro Technology plc                     Tel: +44 (0) 1223 518566
645 Newmarket Road                            Fax: +44 (0) 1223 518526
Cambridge, CB5 8PB, United Kingdom            WWW: http://www.pace.co.uk/

