On Tue, Feb 19, 2002 at 11:57:11AM +0900, Dan Kogai wrote:
>   Other major codings that are missing is obviously CNS11643.  I don't 
> know much about it but so far as I know CNS11643 is ISO-2022 compliant 
> and CNS11643-1 and CNS11643-2 covers Big5.

Hmm?  It occurs to me that
http:[EMAIL PROTECTED]/msg60957.html

already has their maps, provided by SADAHIRO Tomoyuki. The CNS tools in Taiwan
are proprietary, though, so I won't be able to verify the maps' accuracy beyond
the unicode.org reference map.

>   But as you see Encode:XX is so far dependent on Tcl encoding and there 
> is no CNS11643 there yet....

So, are there known problems with using the above maps?

> >Anyway, I'll get some more tests (and get GB working) when I wake up.
>   It does!  Now we are looking for testers of KR and CN as well.  Anyone?

GB2312 (CN) is absolutely broken: it rejects every valid GB input I could
muster (including EUC-CN, HZ and GBK encodings). I suppose the original Tcl map
is broken as well, since it declares itself a type D (double-byte) mapping,
while in practice GB2312 is almost always used as a type M (multi-byte)
encoding, with 0xA1-0xFE as 'lead' bytes. GB12345 is similarly broken.
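
To illustrate the D-versus-M point, here is a minimal sketch in Python (used only
for illustration; this is not Encode or Tcl code, and the function name
split_euc_cn is hypothetical). It assumes the common EUC-CN layout: bytes below
0x80 stand alone as ASCII, and a lead byte in 0xA1-0xFE must be followed by a
trail byte in the same range. A pure double-byte reading would instead pair up
every two bytes and so misread any ASCII mixed into the input.

```python
def split_euc_cn(data):
    """Split EUC-CN bytes into 1-byte (ASCII) and 2-byte (GB2312) units."""
    units = []
    i = 0
    while i < len(data):
        b = data[i]
        if b < 0x80:
            # Single-byte unit: plain ASCII.
            units.append(data[i:i + 1])
            i += 1
        elif 0xA1 <= b <= 0xFE:
            # Lead byte: the next byte must be a trail byte in 0xA1-0xFE.
            if i + 1 >= len(data) or not (0xA1 <= data[i + 1] <= 0xFE):
                raise ValueError("bad trail byte after lead 0x%02X at offset %d" % (b, i))
            units.append(data[i:i + 2])
            i += 2
        else:
            raise ValueError("invalid byte 0x%02X at offset %d" % (b, i))
    return units


# Mixed ASCII and Chinese text, encoded with Python's built-in gb2312 codec.
sample = "GB 中文 ok".encode("gb2312")
units = split_euc_cn(sample)
print(units)
```

Each unit can then be looked up on its own, which is roughly what a multi-byte
map has to allow for and a strictly double-byte one does not.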

I'll see what I can do to regenerate their maps, either from 
http://www.unicode.org/Public/MAPPINGS/ or other official sources.
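
As a rough sketch of the first step in regenerating a map, the following Python
reads a mapping table into a dictionary. It rests on two assumptions: that the
file uses the plain-text layout of two hex columns with '#' comments, and that
the GB2312 table lists code points in 7-bit form, so 0x8080 has to be added to
get EUC-CN byte pairs. The name parse_mapping and the two-line SAMPLE excerpt
are made up for illustration; a real run would read the downloaded file instead.

```python
def parse_mapping(text, offset=0):
    """Parse 'native-code<TAB>unicode' hex lines into {native: unicode}."""
    table = {}
    for line in text.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        fields = line.split()
        if len(fields) < 2:
            continue  # no Unicode column on this line, skip it
        table[int(fields[0], 16) + offset] = int(fields[1], 16)
    return table


# Illustrative two-line excerpt in the assumed format, not the full table.
SAMPLE = """\
# native	unicode
0x2121	0x3000	# IDEOGRAPHIC SPACE
0x3021	0x554A	# <CJK>
"""

# Shift the 7-bit codes into the 0xA1-0xFE range used by EUC-CN.
euc_cn = parse_mapping(SAMPLE, offset=0x8080)
for native, uni in sorted(euc_cn.items()):
    print("0x%04X -> U+%04X" % (native, uni))
```

The resulting dictionary could then be written out in whatever format the
encoding compiler expects; that part is not sketched here.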

Thanks,
/Autrijus/

