> Erm, do you mean that sometimes the same char has different
> pronunciations? Or what? You can of course use the xcin input table as a
> database, because every Chinese char has its pinyin spelling in this table.
> If you want to distinguish between situations where one pronunciation or
> another applies, then of course you should look at the following chars and
> include a little dictionary that decides which pronunciation to use.
> I think a simple Perl script should fit your needs in this case.

$ wc pinyin.cin
  20976  41951 217690 pinyin.cin
Here is the wc output for pinyin.cin. I think I can build a prototype on top of it first. The 20976 lines include multi-character input entries, so the single-character-to-pinyin entries don't really cover all of GBK. Also, the standard pronunciation of a character should include the tone: 1-6 for Cantonese, 1-4 for Mandarin.

ftp://ftp.unicode.org/Public/UNIDATA/Unihan.txt

This file shows in more detail how it should look. What is in pinyin.cin may not be enough for my use, but of course it is better than nothing :)

Thanks a lot for your input,
Alex
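
A rough sketch of the Unihan idea, in Python rather than Perl. It assumes the file uses tab-separated lines of the form `U+XXXX<TAB>field<TAB>value`, with `#` comment lines, and that the readings with tone digits sit in the `kMandarin` and `kCantonese` fields. The function name `parse_unihan_readings` and the sample lines are made up for illustration and are not copied from the real file, so check them against the actual Unihan.txt before relying on this.

```python
from collections import defaultdict


def parse_unihan_readings(lines, fields=("kMandarin", "kCantonese")):
    """Map each character to {field: [readings]} from Unihan-style lines.

    Assumed line format: U+XXXX<TAB>field<TAB>value, where value holds
    one or more space-separated readings that carry a tone digit.
    """
    readings = defaultdict(dict)
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        parts = line.split("\t")
        if len(parts) != 3:
            continue  # not a field line in the assumed format
        codepoint, field, value = parts
        if field not in fields or not codepoint.startswith("U+"):
            continue  # keep only the pronunciation fields we asked for
        char = chr(int(codepoint[2:], 16))
        readings[char][field] = value.split()
    return dict(readings)


# Illustrative sample lines in the assumed format (hypothetical data).
SAMPLE = [
    "# comment line",
    "U+4E00\tkCantonese\tjat1",
    "U+4E00\tkDefinition\tone",
    "U+4E00\tkMandarin\tYI1",
    "U+884C\tkMandarin\tXING2 HANG2",
]

table = parse_unihan_readings(SAMPLE)
for char, by_field in table.items():
    # Print the code point instead of the glyph to avoid terminal encoding issues.
    print(f"U+{ord(char):04X}", by_field)
```

A character with several readings, like the last sample line, comes back as a list, so the "little dictionary" that picks a pronunciation from the following chars could be layered on top of this table.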

