Thanks for your interest and comments. A Korean standards body compiled Hangul character frequency statistics in order to choose the 2350 most frequent Hangul characters (freq > 0.001%) for the legacy KSC5601 Hangul code system. Reordering table v3.0 may contain either the full 2350 or the top 1024 KSC5601 Hangul characters. It's the perfect set: one the Korean government approved, backed by statistics. :-)
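
As a rough illustration of the kind of selection described above (keep characters whose share of the corpus exceeds a cutoff such as 0.001%, optionally capped at the top N), here is a minimal Python sketch. The function name select_frequent, its parameters, and the tiny sample counts are all made up for illustration; this is not the procedure the standards body actually used, which this message does not describe.

```python
from collections import Counter

def select_frequent(counts, min_share=0.00001, top_n=None):
    """Return characters in descending frequency order.

    counts    -- mapping of character -> occurrence count (hypothetical input)
    min_share -- keep characters whose share of all occurrences exceeds this
                 value; 0.00001 corresponds to the 0.001% cutoff
    top_n     -- optional cap on the number of characters returned
    """
    total = sum(counts.values())
    if total == 0:
        return []
    # Sort by descending count; break ties by code point so the result is stable.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    selected = [ch for ch, n in ranked if n / total > min_share]
    if top_n is not None:
        selected = selected[:top_n]
    return selected

# Toy corpus, invented for this example only.
sample = Counter("가가가가나나나다다라")
print(select_frequent(sample, min_share=0.15))
print(select_frequent(sample, min_share=0.0, top_n=2))
```

With real statistics, the same call with top_n=1024 would give the kind of "top-1024" subset mentioned above, assuming the counts come from the published frequency table.
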
I know that government standards bodies and committees in CN/TW/JP/KR have compiled, or still maintain, Han character statistics for similar purposes. The legacy Chinese-character code systems of CN/TW/KR/JP each contain 4000~13000 frequent Han characters. It is very clear that the number of TC characters common to CN/TW/KR/JP is less than 4800. For each TC character in that common set, we add its SC/Kanji variants to the set. That gives about 6000~7000 Han characters which the CN/TW/KR/JP governments have approved as frequent and have some statistics on.

But even such authoritative statistics cannot be optimal. They will ALWAYS be sub-optimal; they simply come from authoritative, maintained sources. Some additional mixing/tuning/training on top of those statistics is needed.

Any suggestions are welcome.

Regards,
Soobok Lee

----- Original Message -----
From: "xiaodong lee" <[EMAIL PROTECTED]>
To: "Soobok Lee" <[EMAIL PROTECTED]>; "James Seng/Personal" <[EMAIL PROTECTED]>; "Bruce Thomson" <[EMAIL PROTECTED]>; <[EMAIL PROTECTED]>
Sent: Friday, November 09, 2001 8:29 PM
Subject: Re: [idn] summary of reordering discussion

> That is great.
> We need to do it and find some org to support it, not simply deny it.
> If we use some authoritative data to produce a result, it will be
> more useful for people to use.
>
> ----- Original Message -----
> From: "Soobok Lee" <[EMAIL PROTECTED]>
> To: "James Seng/Personal" <[EMAIL PROTECTED]>; "Bruce Thomson" <[EMAIL PROTECTED]>; <[EMAIL PROTECTED]>
> Sent: November 9, 2001 12:41 PM
> Subject: Re: [idn] summary of reordering discussion
>
> ----- Original Message -----
> From: "James Seng/Personal" <[EMAIL PROTECTED]>
>
> > This is the biggest problem I have with reordering, i.e., the lack of
> > a credible table to reference. And yes, there is no table to reference
> > unfortunately.
>
> I have found some governmental authorities that have published
> official character frequency statistics for their scripts.
>
> > -James Seng
