On Tue, Feb 19, 2002 at 11:57:11AM +0900, Dan Kogai wrote:
> Other major codings that are missing is obviously CNS11643. I don't
> know much about it but so far as I know CNS11643 is ISO-2022 compliant
> and CNS11643-1 and CNS11643-2 covers Big5.

Hmm? It occurs to me that http:[EMAIL PROTECTED]/msg60957.html already
has their map, provided by SADAHIRO Tomoyuki. However, the CNS tools in
Taiwan are proprietary, so I will not be able to verify its accuracy
beyond the unicode.org reference map.

> But as you see Encode:XX is so far dependent on Tcl encoding and there
> is no CNS11643 there yet....

So, are there known problems with using the above maps?

> >Anyway, I'll get some more tests (and get GB working) when I wake up.
> It does! Now we are looking for testers of KR and CN as well. Anyone?

GB2312 (CN) is completely broken: it rejects every valid GB input I
could muster (including the EUC-CN, HZ and GBK encodings). I suspect the
original Tcl map is broken as well, since it lists itself as a type D
(double-byte) mapping, whereas in practice GB2312 is almost always used
as a type M (multibyte) encoding, with 0xA1-0xFE as 'lead' bytes.
GB12345 is similarly broken.

I'll see what I can do to regenerate their maps, either from
http://www.unicode.org/Public/MAPPINGS/ or from other official sources.

Thanks,
/Autrijus/
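
P.S. To make the D-versus-M point concrete, here is a minimal Python
sketch. It is not the Encode or Tcl code, and the function names
split_euc_cn and split_fixed_pairs are hypothetical. It assumes GB2312
in its usual EUC-CN byte form, where ASCII bytes stand alone and every
two-byte character starts with a lead byte in 0xA1-0xFE. A reader that
treats the stream as fixed double-byte pairs loses alignment as soon as
a single ASCII byte appears.

```python
def split_euc_cn(data):
    """Split EUC-CN bytes into one-byte (ASCII) and two-byte units.

    Assumption: bytes below 0x80 are single-byte characters, and a byte in
    0xA1-0xFE is a lead byte that must be followed by a trail byte in the
    same range.
    """
    units = []
    i = 0
    while i < len(data):
        lead = data[i]
        if lead < 0x80:
            # Single-byte (ASCII) character.
            units.append(data[i:i + 1])
            i += 1
        elif 0xA1 <= lead <= 0xFE:
            # Two-byte character: check that a valid trail byte follows.
            if i + 1 >= len(data) or not (0xA1 <= data[i + 1] <= 0xFE):
                raise ValueError("bad or missing trail byte at offset %d" % i)
            units.append(data[i:i + 2])
            i += 2
        else:
            raise ValueError("invalid lead byte 0x%02X at offset %d" % (lead, i))
    return units


def split_fixed_pairs(data):
    """Naive 'type D' reading: consume the stream strictly two bytes at a time."""
    return [data[i:i + 2] for i in range(0, len(data), 2)]


# 'a' followed by two GB2312 characters (0xD6D0 and 0xCEC4).
sample = b"a\xd6\xd0\xce\xc4"

print(split_euc_cn(sample))       # [b'a', b'\xd6\xd0', b'\xce\xc4']
print(split_fixed_pairs(sample))  # [b'a\xd6', b'\xd0\xce', b'\xc4']
```

The second line of output shows the misalignment: the fixed-pair reading
glues the ASCII byte to the first lead byte, so none of the resulting
pairs matches an entry in a double-byte table.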