On Aug 22, 2005, at 10:48 PM, Tex Texin wrote:
One more comment: we should take into account that most data will not
use combining characters, and optimize for that case.
Most text will consist solely of characters with combining class = 0.
We can therefore scan backwards, copying characters while cc=0.
Only if we see a non-zero cc do we need to do anything special for
combining chars.
In that case you can use the breakiterator to continue, or if you
prefer to do it on your own, keep scanning over characters with cc<>0
until the next character with cc=0, and then copy that one base
character and its trailing combining chars to the end of the result
string.
Repeat until the beginning of the string.
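
The backward scan described above could be sketched roughly as follows.
This is only an illustration in Python, not the PHP implementation: the
function name reverse_keep_marks is hypothetical, and it assumes that
unicodedata.combining() (which returns the canonical combining class) is
an acceptable stand-in for the cc lookup. No break iterator or locale
data is involved.

```python
import unicodedata


def reverse_keep_marks(s: str) -> str:
    """Reverse s, keeping each base character together with the
    combining characters (cc != 0) that follow it."""
    out = []
    i = len(s)
    while i > 0:
        end = i          # one past the last char of the current cluster
        i -= 1
        # Common case: s[i] has cc == 0 and this loop never runs.
        # Otherwise scan back over combining chars to the base character.
        while i > 0 and unicodedata.combining(s[i]) != 0:
            i -= 1
        # Copy the base character plus its trailing combining chars.
        out.append(s[i:end])
    return "".join(out)
```

For plain text with cc=0 throughout, the inner loop is never entered, so
the cost is a single backward pass. A combining character at the very
start of the string, with no base before it, is copied as its own unit.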
I prefer we do not use break iterators for this, since those depend on
locale and we agreed to leave locale-dependent funcs to the unicode
extension.
-Andrei
--
PHP Internals - PHP Runtime Development Mailing List