Hi again!  :-)

On Wed, Mar 20, 2002 at 02:43:49PM +0800, James Su wrote:
> Hi,
> I have done the HKSCS-2001 patchs for glibc, XFree86 and qt-2.3.x. But I
> have no enough time to test them. And the most important, I have no ttf
> font that comforms to HKSCS-2001 :-(

That's true.  I guess that even those 70000+-glyph super CJK fonts by
Founders may not totally conform to HKSCS-2001 yet, because they may lack
the compatibility mappings (in the PUA) and the 30 or so characters not
yet accepted into ISO 10646.

The ITSD provides a font on their website.  It only uses the EUDC, but
it could be a start.  BTW, Roger says that the *.TTE may be used under
XFree86.  I just haven't tried it yet.  :-)

For now, I just use the Perl script to generate a text file containing
the Unicode <-> Big5-HKSCS mapping, then use a small C program that calls
glibc's functions to produce the same mapping, and compare the two with
"diff".  :-)

> The patchs are very big, where can I put it?

If you like, you could put them on your website.  :-)

BTW, I finally got a glibc HKSCS-2001 big5hkscs.c and other test utils
up on

    http://www.thizlinux.com/~anthony/hkscs/
    http://people.debian.org/~foka/hkscs/

I am not too happy that the U+20021..U+2F9D4 to Big5-HKSCS mapping takes
up so much space (nearly 70 KB for just 1651 entries) because of the
sparse array.  Anyhow, BIG5HKSCS.so grew from 115165 bytes to 185240
bytes here.  Not a big deal, but even the big GB18030.so is only 130356
bytes...  :-)  I am thinking of using a hash instead for that area, but
that'll be an experiment for another day.  Fun stuff!  :-)

I am curious how you implemented yours.  Perhaps you know other neat
tricks to make the table even smaller?  :-)

> btw, I only made a HKSCS-2001 iconv module for glibc, the charmap is not
> patched yet.

Yes, Ulrich wants the charmap and the test data modified too.

> I do not understand big5cmp.txt exactly. I only use big5_iso.txt and
> CP950 codepage.

Andrew Fung at ITSD explained it to me a while ago.  Basically, GCCS
contained some glyphs which differ slightly in shape from existing
characters, but were classified as unique characters.  However, when
they began making HKSCS, those glyphs were considered variants (Z axis)
of the same character by ISO 10646 / Unicode standards.
So, they are "unified" with the original character, and the codepoint is
retained for backward compatibility.  I sent you the full explanation in
an earlier e-mail.  You may also set $debug = 1 in
gen-glibc-big5hkscs.pl and look at all the "Was: .... Now: ...." output
to see what it's about.  :-)

> > Was: U+E33A -> 8E69, 8E69 -> U+E33A   Now: U+E33A -> BAE6, 8E69 -> U+7BB8
> > Was: U+E340 -> 8E6F, 8E6F -> U+E340   Now: U+E340 -> EDCA, 8E6F -> U+7C06
> > Was: U+E34F -> 8E7E, 8E7E -> U+E34F   Now: U+E34F -> A261, 8E7E -> U+7CCE

In GCCS the mapping was PUA <-> Big5 EUDC (a round trip).  Now it is
PUA -> Big5 proper, and Big5 EUDC -> CJK Unified area.  Something like
that.

> Anthony Fok wrote:
>
> >On Tue, Mar 19, 2002 at 11:02:54PM +0800, James Su wrote:
> >
> >>Hi,
> >>I made a patch for hkscs too :-) This patch conforms to HKSCS 2001
> >>standard.
> >>
> >>Regards
> >>James Su
> >>
> >
> >Hello James,
> >
> >Wow, you are so quick! Very impressive, as always! :-) I was going to
> >working on a HKSCS-2001 patch for glibc today, (with ISO 10646-1:2000;
> >XFree86 and many other programs don't support U+20000 stuff yet), and then
> >move on to XFree86. I'm glad you already did the work for X. :-)
> >
> >BTW, did you use big5cmp.txt too, to deal with unified characters?
> >A minor detail, e.g.
> >
> >
> >I'll see how much work I can get done on the glibc HKSCS-2001
> >(ISO 10646-1:2000 version) table, and compare notes with you.
> >
> >Cheers,
> >
> >Anthony

-- 
Anthony Fok Tung-Ling 霍東靈
ThizLinux Laboratory   <[EMAIL PROTECTED]> http://www.thizlinux.com/
Debian Chinese Project <[EMAIL PROTECTED]> http://www.debian.org/intl/zh/
Come visit Our Lady of Victory Camp!       http://www.olvc.ab.ca/
_______________________________________________
I18n mailing list
[EMAIL PROTECTED]
http://XFree86.Org/mailman/listinfo/i18n
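
For reference, the three "Was/Now" lines quoted above can be read as two
separate one-way tables that are no longer inverses of each other.  The
following is only a minimal sketch of that idea in C, not the code in
big5hkscs.c: the struct, the table names and the demo_* functions are
made up for illustration, and the only data used are the six values
from the "Now:" columns.

```c
#include <stddef.h>
#include <stdint.h>

/* One direction of a mapping: key -> value (hypothetical helper type). */
struct map_entry {
    uint32_t key;
    uint32_t value;
};

/* Big5-HKSCS -> Unicode: the old EUDC codes now decode to unified CJK
   ideographs (values copied from the "Now:" lines). */
static const struct map_entry to_ucs[] = {
    { 0x8E69, 0x7BB8 },
    { 0x8E6F, 0x7C06 },
    { 0x8E7E, 0x7CCE },
};

/* Unicode -> Big5-HKSCS: the old GCCS PUA code points are still accepted,
   but now encode to codes in Big5 proper. */
static const struct map_entry from_ucs[] = {
    { 0xE33A, 0xBAE6 },
    { 0xE340, 0xEDCA },
    { 0xE34F, 0xA261 },
};

/* Linear search is enough for a three-entry illustration.
   Returns 0 when the key is not in the table. */
static uint32_t lookup(const struct map_entry *tab, size_t n, uint32_t key)
{
    for (size_t i = 0; i < n; i++)
        if (tab[i].key == key)
            return tab[i].value;
    return 0;
}

/* Decode direction for the demo table only. */
uint32_t demo_big5hkscs_to_ucs(uint32_t big5)
{
    return lookup(to_ucs, sizeof to_ucs / sizeof to_ucs[0], big5);
}

/* Encode direction for the demo table only. */
uint32_t demo_ucs_to_big5hkscs(uint32_t ucs)
{
    return lookup(from_ucs, sizeof from_ucs / sizeof from_ucs[0], ucs);
}
```

With these tables, demo_big5hkscs_to_ucs(0x8E69) gives 0x7BB8 and
demo_ucs_to_big5hkscs(0xE33A) gives 0xBAE6, so neither the old PUA code
point nor the old EUDC code survives a round trip.  How U+7BB8 itself
encodes is not stated in the lines above, so the sketch leaves it out.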