I've retried with 3.2-17 and got the same results. Notably, the issue isn't
(and has not been) that all multibyte characters are handled improperly.
Instead, sequences that contain combining characters seem to be handled
inconsistently. For example, the character that represents D
WITH DOT
Sean Burke wrote:
The Unicode normalization test data at
http://www.unicode.org/Public/UNIDATA/NormalizationTest.txt
contains many sequences of this sort.
The first character sequence, LATIN CAPITAL LETTER D WITH DOT
ABOVE, does produce this problem.
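
For reference, a minimal Python sketch of the two canonically equivalent
forms of this character: the precomposed code point U+1E0A, and the base
letter D followed by the combining mark U+0307. It relies only on the
standard unicodedata module; the variable names and the describe() helper
are illustrative assumptions, and it shows nothing about how any particular
shell or terminal handles the input.

```python
import unicodedata

# Precomposed form: a single code point.
composed = "\u1E0A"        # LATIN CAPITAL LETTER D WITH DOT ABOVE
# Decomposed form: base letter followed by a combining mark.
decomposed = "D\u0307"     # LATIN CAPITAL LETTER D + COMBINING DOT ABOVE


def describe(text):
    """Return the Unicode name of each code point in text (hypothetical helper)."""
    return [unicodedata.name(ch) for ch in text]


# Canonical decomposition (NFD) and composition (NFC) map between the forms.
nfd = unicodedata.normalize("NFD", composed)
nfc = unicodedata.normalize("NFC", decomposed)

# A non-zero combining class marks U+0307 as a combining character.
dot_class = unicodedata.combining("\u0307")

if __name__ == "__main__":
    print(describe(composed), len(composed))
    print(describe(decomposed), len(decomposed))
    print(nfd == decomposed, nfc == composed, dot_class)
```

Both forms should render as the same glyph, but they differ in code point
count, which is one place where software that counts characters or columns
can disagree with itself.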
Paste it into the