Ezio Melotti added the comment:

The Greek sample includes 155 unique characters (including whitespace,
punctuation, and the English characters at the beginning), so they can all fit
in the cache.
The Chinese sample, however, includes 3695 unique characters (all within the
BMP), probably causing many more cache misses, so the overhead of the cache
lookups shows up as a slowdown.
The Chinese text you used for the test is also from some 700 years ago, and
uses traditional and vernacular Chinese, so the number of unique characters is
higher than what you would normally encounter in modern Chinese.
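
For anyone who wants to check counts like these on their own samples, here is a
minimal sketch. The helper name unique_char_stats and the two short strings are
illustrative stand-ins, not the texts used in the benchmark; the sketch only
counts distinct code points and tests whether they all lie within the BMP
(code points up to U+FFFF). It does not model the cache itself.

```python
def unique_char_stats(text):
    """Return (number of distinct characters, True if all are in the BMP)."""
    chars = set(text)
    # The Basic Multilingual Plane covers code points U+0000 through U+FFFF.
    all_bmp = all(ord(ch) <= 0xFFFF for ch in chars)
    return len(chars), all_bmp


if __name__ == "__main__":
    # Hypothetical stand-in strings, not the samples from the benchmark.
    greek = "Hello, Καλημέρα κόσμε"
    chinese = "Hello, 話說天下大勢"
    for name, sample in (("greek", greek), ("chinese", chinese)):
        count, all_bmp = unique_char_stats(sample)
        print(f"{name}: {count} unique characters, all in BMP: {all_bmp}")
```

Running it on the full benchmark texts (read with the right encoding) should
give the per-sample counts to compare against the cache size.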

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue31484>
_______________________________________