Ram Viswanadha wrote:
There is also some information at http://oss.software.ibm.com/icu/docs/papers/binary_ordered_compression_for_unicode.html#Test_Results
Not sure if this is what you are looking for.

Thanks, but not really. I am not looking into the ratio caused by encoding, but rather the ratio caused by the language itself. For example, to communicate the idea "I want to eat chicken for dinner tonight", French and German, using the same encoding, may need different numbers of characters to communicate the same "IDEA".
Misha's paper helps a lot, but unfortunately it lacks Japanese and German data.
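
To make the kind of ratio I mean concrete, here is a rough Python sketch. It counts characters (code points after NFC normalization, so the encoding plays no part) for the same sentence in several languages and reports each count relative to English. The function names `char_count` and `expansion_ratios` are made up for illustration, and the French and German sentences are only my own illustrative translations, not measured data from any corpus.

```python
import unicodedata


def char_count(text: str) -> int:
    """Count code points after NFC normalization.

    This is independent of the byte encoding (UTF-8, UTF-16, ...).
    It is only an approximation of user-perceived characters.
    """
    return len(unicodedata.normalize("NFC", text))


def expansion_ratios(samples: dict[str, str], baseline: str = "en") -> dict[str, float]:
    """Return each language's character count divided by the baseline's."""
    base = char_count(samples[baseline])
    return {lang: char_count(text) / base for lang, text in samples.items()}


# Illustrative translations of one "idea" (assumed, not taken from a corpus).
samples = {
    "en": "I want to eat chicken for dinner tonight",
    "fr": "Je veux manger du poulet pour le dîner ce soir",
    "de": "Ich möchte heute Abend Hähnchen zum Abendessen essen",
}

for lang, ratio in expansion_ratios(samples).items():
    print(f"{lang}: {char_count(samples[lang])} chars, ratio {ratio:.2f}")
```

A single sentence proves nothing, of course. A real figure would need a large parallel text in each language, which is exactly the Japanese and German data I am missing.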

