--- Andrei Zmievski <[EMAIL PROTECTED]> wrote:
> 2) Combining sequences are not respected. We can't swap base
> character and the combining chars that follow it because the string
> may be concatenated with something else and the combining chars may
> end up affecting something else. So we need to work at grapheme level
> here, using u_getCombiningClass() to check for combining chars and
> copying the base+combining as a unit.

OK, so I guess the code should track the combining class and copy out
chunks of codepoints with the same class, something like:

    int32_t prev;       /* Last class transition */
    uint8_t class = 0;

    while ( /* iterate backward over string */ ) {
        while (u_getCombiningClass(codept) == class) {
            /* Get 'next' codept */
        }
        /* Copy codepts from 'next' to 'prev' */
    }

Is that correct?

-- Rolland
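
For illustration, here is a minimal sketch of the "copy base+combining as a unit" idea from the quoted message. It is not the code under discussion. It assumes the string is already decoded into an array of 32-bit code points, and it uses a hypothetical stand-in, combining_class(), in place of ICU's u_getCombiningClass() so that it compiles without ICU. The stand-in only recognizes the Combining Diacritical Marks block (U+0300..U+036F). The function name reverse_keep_combining() is likewise made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for ICU's u_getCombiningClass(). It treats only
 * U+0300..U+036F as combining and returns a nonzero placeholder class
 * for that range. Real code would call ICU instead. */
static uint8_t combining_class(uint32_t cp)
{
    return (cp >= 0x0300 && cp <= 0x036F) ? 230 : 0;
}

/* Reverse 'len' code points from 'src' into 'dst' (which must have room
 * for 'len' code points). Each base character and the combining marks
 * that follow it are copied as one unit, in their original order. */
static void reverse_keep_combining(const uint32_t *src, size_t len,
                                   uint32_t *dst)
{
    size_t end = len;   /* one past the last code point not yet copied */
    size_t out = 0;     /* next write position in dst */

    assert(len == 0 || (src != NULL && dst != NULL));

    while (end > 0) {
        size_t start = end - 1;

        /* Walk backward over combining marks until a base character
         * (class 0) or the start of the string is reached. */
        while (start > 0 && combining_class(src[start]) != 0) {
            start--;
        }

        /* Copy the unit src[start..end) forward, so the marks stay
         * attached to their base. */
        for (size_t i = start; i < end; i++) {
            dst[out++] = src[i];
        }

        end = start;
    }
}
```

The inner loop tests for a nonzero class rather than for "same class as before", since a base character can be followed by marks of several different classes. A combining class of 0 does not by itself guarantee a grapheme boundary, so this is only an approximation of working at grapheme level.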