In message <[EMAIL PROTECTED]> Asmus Freytag <[EMAIL PROTECTED]> wrote:
> On top of that, it looks like 950 maps a bogus symbol or punctuation
> character to U+2574. (2574 is one of a set of 4, and only 1 is mapped for
> starters. Fonts covering CP950 give a way different image for that
> character than you'd expect from either the charts or the names...

I recently had to sort out our systems' Big5<->Unicode mapping table, and
there seems to be great confusion in the punctuation space. The table (that
used to be) on the Unicode site was unsatisfactory, and Microsoft's CP950
mapping also doesn't seem to make sense (e.g. with that U+2574 mapping, and
CIRCLED PLUS and DOT OPERATOR instead of EARTH and SUN).

One point of note is that there is a whole cluster of characters in the
compatibility area of Unicode, from U+FE30 to U+FE6B, that are designed to
handle mapping CNS11643, whose punctuation area is almost identical to
Big5's. The mapping tables I've seen don't make proper use of them.

I was able to come up with a good Big5 mapping by taking the best ideas from
various Big5 and CNS11643 tables on the net, then making sure each of those
Unicode compatibility characters was used once, AND IN THE ORDER THEY APPEAR
IN UNICODE.

This ends up mapping A15A to U+FE58 SMALL EM DASH, which still might not be
right, but it looks like a confused character anyway - it appears different
in Big5 and CNS11643 tables, so it could just be a glyph variant issue.

-- 
Kevin Bracey, Principal Software Engineer
Pace Micro Technology plc                     Tel: +44 (0) 1223 518566
645 Newmarket Road                            Fax: +44 (0) 1223 518526
Cambridge, CB5 8PB, United Kingdom            WWW: http://www.pace.co.uk/
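
P.S. For anyone wanting to run the same sanity check on their own table, the "used once, and in Unicode order" rule can be sketched roughly as below. This is only a minimal illustration in Python, not the tool actually used: the function name `check_compat_order` and the sample table are hypothetical, and the only pair taken from the message above is A15A -> U+FE58. It checks for duplicates and ordering among the characters a table does use; checking that none of the compatibility characters is left unused would need a list of the assigned code points for the Unicode version in question, which is not included here.

```python
# Hypothetical helper for illustration; not part of any real mapping tool.
COMPAT_FIRST, COMPAT_LAST = 0xFE30, 0xFE6B  # compatibility range named above


def check_compat_order(mapping):
    """Check a Big5 -> Unicode table (dict of int -> int).

    Walking the table in Big5 order, every target inside the
    compatibility range must be used at most once and must appear
    in increasing Unicode order. Returns a list of problem strings,
    which is empty if the table passes.
    """
    problems = []
    seen = {}          # Unicode code point -> first Big5 code that used it
    highest = None     # highest compatibility code point seen so far
    for big5 in sorted(mapping):
        ucs = mapping[big5]
        if not COMPAT_FIRST <= ucs <= COMPAT_LAST:
            continue   # not a compatibility character, nothing to check
        if ucs in seen:
            problems.append(
                f"U+{ucs:04X} used twice: {seen[ucs]:04X} and {big5:04X}")
            continue
        if highest is not None and ucs < highest:
            problems.append(f"{big5:04X} -> U+{ucs:04X} is out of order")
        seen[ucs] = big5
        highest = ucs if highest is None else max(highest, ucs)
    return problems


# Sample table: only A15A -> U+FE58 comes from the message; the other
# pair is a placeholder, not taken from a real Big5 table.
sample = {0xA15A: 0xFE58, 0xA15B: 0xFE59}
for line in check_compat_order(sample):
    print(line)
```

An empty result only means the table is internally consistent in this one respect; whether each individual pairing is the right one still has to be judged against the glyphs.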