On 17/05/2021 at 17:17, David Li wrote:
A little clarification on my point: it's not that a single codepoint
gets encoded with more than four bytes, it's that a grapheme
cluster/human-delimited 'character' might be multiple codepoints, so
reversing the individual codepoints may produce an unexpected
result. For instance, a flag emoji is actually two codepoints (two
special 'letter' codepoints that represent the country code), so
reversing a US flag naively will give you an odd '[SU]' instead.
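
To illustrate the point, here is a minimal Python sketch of a naive
codepoint-wise reversal. The helper name reverse_codepoints is
hypothetical and is not the kernel under discussion; it only assumes
that reversal is done one codepoint at a time.

```python
import unicodedata

# The US flag is two codepoints: REGIONAL INDICATOR SYMBOL LETTER U
# followed by REGIONAL INDICATOR SYMBOL LETTER S.
us_flag = "\U0001F1FA\U0001F1F8"


def reverse_codepoints(text: str) -> str:
    """Hypothetical helper: reverse a string one codepoint at a time."""
    return text[::-1]


reversed_flag = reverse_codepoints(us_flag)

# The two regional indicators now spell "SU" rather than "US", so the
# result no longer renders as the US flag.
for codepoint in reversed_flag:
    print(f"U+{ord(codepoint):04X} {unicodedata.name(codepoint)}")
```

Reversing by grapheme cluster instead would need a segmentation
step (Unicode Standard Annex #29), which Python's standard library does
not provide.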

This sounds like saying that reversing a valid French word does not produce a valid French word (well, in most cases). The kernel documentation can't contain an entire tutorial about Unicode characters and what to expect from them, IMHO.

Regards

Antoine.
