On Thu, 24 Apr 2014 23:07:58 +0200 Mathias Bynens <[email protected]> wrote:
> I realize reversing a string has nothing to do with text segmentation
> – but ignoring grapheme extenders leads to unexpected results (since
> after reversing the code points, the grapheme extender might extend
> the wrong character):
> https://github.com/mathiasbynens/esrever/issues/5

Actually, it has a lot to do with text segmentation - you need to work out what are really thought of as the characters. שָׁלוֹם is a nice illustration of the problems.

Should reversing twice yield the string you started with? Should reversing three times give the same result as reversing once? What does reversing a Hangul syllable do? Canonical equivalence should be preserved! Should renderability be preserved?

What does Thai เกราะ /krɔ̀ʔ/ <U+0E40, U+0E01, U+0E23, U+0E32, U+0E30> reverse to? /ʔɔ̀rk/ is unpronounceable in Thai, and if it were pronounceable it would be written อ็อรก <U+0E2D, U+0E47, U+0E2D, U+0E23, U+0E01>. Thai เพลา <U+0E40, U+0E1E, U+0E25, U+0E32> is the spelling of two unrelated words, pronounced /pʰlaw/ and /pheː laː/ respectively.

Richard.
_______________________________________________
Unicode mailing list
[email protected]
http://unicode.org/mailman/listinfo/unicode
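
The questions raised above can be tried out with a small sketch. The Python below is an illustration only: it is not esrever's implementation and not the UAX #29 segmentation algorithm. The helper names (reverse_code_points, split_clusters, reverse_clusters, canonically_equivalent) are made up for this example, and a "cluster" is approximated as one code point plus any following code points of general category Mn or Me.

```python
import unicodedata


def reverse_code_points(s):
    """Naive reversal: reverse the sequence of code points."""
    return s[::-1]


def split_clusters(s):
    """Rough approximation of grapheme clusters (an assumption of this
    sketch, NOT UAX #29): each nonspacing or enclosing mark (Mn, Me)
    is attached to the code point that precedes it."""
    clusters = []
    for ch in s:
        if clusters and unicodedata.category(ch) in ("Mn", "Me"):
            clusters[-1] += ch
        else:
            clusters.append(ch)
    return clusters


def reverse_clusters(s):
    """Reverse the order of the approximate clusters, keeping each
    cluster's internal code point order intact."""
    return "".join(reversed(split_clusters(s)))


def canonically_equivalent(a, b):
    """Two strings are canonically equivalent if their NFD forms match."""
    return unicodedata.normalize("NFD", a) == unicodedata.normalize("NFD", b)


# Two canonically equivalent spellings of the same word:
precomposed = "ma\u00F1ana"   # U+00F1 LATIN SMALL LETTER N WITH TILDE
decomposed = "man\u0303ana"   # n followed by U+0303 COMBINING TILDE

# Naive reversal moves the tilde onto the wrong letter and breaks
# canonical equivalence between the two results.
naive_ok = canonically_equivalent(
    reverse_code_points(precomposed), reverse_code_points(decomposed))

# Cluster-aware reversal keeps the two results canonically equivalent.
cluster_ok = canonically_equivalent(
    reverse_clusters(precomposed), reverse_clusters(decomposed))

# A string that starts with a stray combining mark does not round-trip:
# after one reversal the mark attaches to the letter before it.
defective = "\u0303n"
round_trip_ok = reverse_clusters(reverse_clusters(defective)) == defective

# The Thai example contains only category-Lo code points, so this
# approximation treats it exactly like code point reversal.
thai = "\u0E40\u0E01\u0E23\u0E32\u0E30"
thai_unchanged = reverse_clusters(thai) == reverse_code_points(thai)
```

Under these assumptions the mark-aware version addresses the misplaced-extender problem and the canonical-equivalence point for this example, but it leaves the Thai case identical to plain code point reversal and does nothing special for Hangul jamo sequences, so the linguistic questions stay open.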

