Re: Errors in Unihan data : simplified/traditional variants

2010-11-01 Thread John H. Jenkins
On 2010/10/30, at 8:42 PM, Koxinga wrote: My quickly done parsing program counted 1154 such pairs, where the head character was the same as the character above. It seems to be always in the order kTraditionalVariant then kSimplifiedVariant, so can maybe be automatically corrected. It seems
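The kind of count described in the quoted message could be sketched roughly as follows. This is only a guess at what such a parsing program might look like, not Koxinga's actual code. It assumes whitespace-separated lines of the form `U+XXXX field value`, and the helper name `count_paired_entries` and the sample lines (private-use code points, not real Unihan data) are made up for illustration.

```python
def count_paired_entries(lines):
    """Count places where a kTraditionalVariant line is immediately followed
    by a kSimplifiedVariant line for the same head code point."""
    count = 0
    prev = None  # (head, field) of the previous data line
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        parts = line.split(None, 2)
        if len(parts) < 3:
            continue  # not a "head field value" line
        head, field = parts[0], parts[1]
        if prev == (head, "kTraditionalVariant") and field == "kSimplifiedVariant":
            count += 1
        prev = (head, field)
    return count


# Synthetic sample, NOT taken from Unihan_Variants.txt.
sample = [
    "# comment line",
    "U+E000\tkTraditionalVariant\tU+E001",
    "U+E000\tkSimplifiedVariant\tU+E002",
    "U+E003\tkSimplifiedVariant\tU+E004",
]
print(count_paired_entries(sample))
```

On the real file one would pass an open file handle instead of the sample list.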

Errors in Unihan data : simplified/traditional variants

2010-10-31 Thread Koxinga
Hello, I recently looked up the traditional-simplified relationships in the Unihan database (Unihan_Variants.txt). I knew it had mistakes and I wanted to help correct some of them, but the first thing that stood out and surprised me was the large number of lines like: U+346F

Errors in Unihan?

2000-11-14 Thread Pierpaolo Bernardi
Hello, In the Unihan.txt database, in the kMandarin field there are entries with duplicate pronunciations. For example: U+4E21 kMandarin 1 LIANG3 2 LIANG3 3 LIANG4 U+4E4E kMandarin 1 HU1 HU2 2 HU1 U+4E86 kMandarin 1 LIAO3 2 LE LIAO3 Is there a reason for these duplicates?
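A quick way to spot such duplicates, as a minimal sketch: it assumes the bare digits in the field value are group markers rather than readings (an assumption; the message does not say what they mean), and the function name `duplicate_readings` is hypothetical.

```python
def duplicate_readings(field_value):
    """Return the readings that occur more than once in a kMandarin-style value."""
    seen = set()
    dups = []
    for token in field_value.split():
        if token.isdigit():
            continue  # assumed to be a group marker (1, 2, 3), not a reading
        if token in seen and token not in dups:
            dups.append(token)
        seen.add(token)
    return dups


# The three entries quoted in the message above.
samples = {
    "U+4E21": "1 LIANG3 2 LIANG3 3 LIANG4",
    "U+4E4E": "1 HU1 HU2 2 HU1",
    "U+4E86": "1 LIAO3 2 LE LIAO3",
}
for code_point, value in samples.items():
    print(code_point, duplicate_readings(value))
```

For the three quoted entries this reports LIANG3, HU1 and LIAO3 respectively.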

Re: Errors in Unihan?

2000-11-14 Thread John Jenkins
On Tuesday, November 14, 2000, at 08:24 AM, Pierpaolo Bernardi wrote: In the Unihan.txt database, in the kMandarin field there are entries with duplicate pronunciations. For example: U+4E21 kMandarin 1 LIANG3 2 LIANG3 3 LIANG4 U+4E4E kMandarin 1 HU1 HU2 2 HU1