On Tue, 6 Nov 2001, John H. Jenkins wrote:
>
> On Tuesday, November 6, 2001, at 09:41 AM, Michael Everson wrote:
>
> > John Jenkins said:
> >
> >> Ah, but you should never underestimate the power of the Force.  Remember
> >> that the IRG is *already* looking to adding some 60,000 ideographs for
> >> Extension C.
> >
> > Where do they find these things?

> The overwhelming majority of them are coming from medieval Korean Buddhist
> documents.

I'm dumbstruck to hear that. Sixty thousand ideographs from medieval
Korean Buddhist documents!! Oh, my gosh. Do they really have 60,000
Chinese characters not yet encoded? I wonder whether the IRG has given
up on unification.

I know what these documents are (八萬大藏經 or 高麗大藏經). At first I
thought you might have added an extra '0', but the position of the comma
suggested that you couldn't have. I went to http://www.sutra.re.kr
(http://www.sutra.re.kr/english) and they indeed have detailed
statistics on variants (the number of occurrences, relative frequency,
etc.). They are available at <http://211.46.71.249/handic/index.htm>.

The default indexing method is based on radicals. Choose a radical in
the bottom left panel and a stroke count in the bottom right panel, and
the list of characters with the radical and stroke count of your choice
will show up below the stroke count list. If you click on a character,
the top panel will give you details on it (Unicode code point +
'variant index'??, pronunciations in Korean, Chinese and Japanese, etc.).
At the very end of the top panel, you'll find an icon with a TV-like
symbol at the right end (the Korean string to its left is
'정체,이체자 정보', meaning 'variant info'). Clicking on it will open a
new window with the detailed statistics I mentioned above.

I think this is a pretty nice source of Chinese character variants,
along with the much more comprehensive Chinese character variant
dictionary in Taiwan (available at http://140.111.1.40).

They also have overall statistics on the Chinese characters found in
the documents at <http://211.46.71.249/charstatistics>.
The more detailed statistics, with frequency ranks, are at
<http://211.46.71.249/charstatistics/freqrank.htm>.

Jungshik Shin

