Martin Schreiber wrote:
All the "access a char by index into a string" code I have seen works
sequentially 99.99% of the time. For that reason there is no
speed difference between using a UTF-16 or a UTF-8 encoded string. Both
can be coded equally efficiently.
Graeme, this is simply not true. When searching for known German characters
in a UnicodeString, the program can use the simple approach of indexing by
character (code unit). This even works for known Chinese characters of the
BMP. And a simple "if" for surrogate pairs is more efficient than a 4-stage
"case" for UTF-8.
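
To make the comparison concrete, here is a rough sketch of the two decoding steps under discussion: one "if" for UTF-16 surrogate pairs versus a four-branch "case" for UTF-8 lead bytes. NextUtf16 and NextUtf8 are hypothetical helper names, not RTL routines, and both assume well-formed input. The test strings are built from explicit code units so the result does not depend on the source file encoding.

```pascal
program DecodeSketch;
{$mode objfpc}{$H+}

uses
  SysUtils;

// Hypothetical helper: decode the code point at S[I] and advance I.
// Assumes well-formed UTF-16 (every high surrogate is followed by a low one).
function NextUtf16(const S: UnicodeString; var I: Integer): Cardinal;
var
  W: Cardinal;
begin
  W := Ord(S[I]);
  Inc(I);
  if (W >= $D800) and (W <= $DBFF) then
  begin
    // high surrogate: combine with the low surrogate that follows
    Result := $10000 + ((W - $D800) shl 10) + (Cardinal(Ord(S[I])) - $DC00);
    Inc(I);
  end
  else
    Result := W;
end;

// Hypothetical helper: decode the code point at S[I] and advance I.
// Assumes well-formed UTF-8 (no validation of continuation bytes).
function NextUtf8(const S: RawByteString; var I: Integer): Cardinal;
var
  B: Byte;
  Extra: Integer;
begin
  B := Ord(S[I]);
  Inc(I);
  case B of
    $00..$7F: begin Result := B; Extra := 0; end;          // 1 byte
    $C0..$DF: begin Result := B and $1F; Extra := 1; end;  // 2 bytes
    $E0..$EF: begin Result := B and $0F; Extra := 2; end;  // 3 bytes
  else
    begin Result := B and $07; Extra := 3; end;            // 4 bytes
  end;
  while Extra > 0 do
  begin
    Result := (Result shl 6) or (Cardinal(Ord(S[I])) and $3F);
    Inc(I);
    Dec(Extra);
  end;
end;

var
  S16: UnicodeString;
  S8: RawByteString;
  I16, I8: Integer;
  C16, C8: Cardinal;
begin
  // 'a', U+00E4, U+1F600 as UTF-16 code units
  SetLength(S16, 4);
  S16[1] := WideChar($0061); S16[2] := WideChar($00E4);
  S16[3] := WideChar($D83D); S16[4] := WideChar($DE00);
  // the same three code points as UTF-8 bytes
  SetLength(S8, 7);
  S8[1] := AnsiChar($61);
  S8[2] := AnsiChar($C3); S8[3] := AnsiChar($A4);
  S8[4] := AnsiChar($F0); S8[5] := AnsiChar($9F);
  S8[6] := AnsiChar($98); S8[7] := AnsiChar($80);

  I16 := 1;
  I8 := 1;
  while I16 <= Length(S16) do
  begin
    C16 := NextUtf16(S16, I16);
    C8 := NextUtf8(S8, I8);
    WriteLn('U+', IntToHex(Int64(C16), 4), '  U+', IntToHex(Int64(C8), 4));
  end;
end.
```

Both loops walk the string sequentially; the difference being argued about is only the branching in the per-character step.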
The good ole Pos() can do that, why search for more complicated
implementations?
You are still trying to use old coding patterns which are simply inappropriate
for dealing with Unicode strings. Why make a distinction between
searching for a single character and for multiple characters, when it's known
that one character can require multiple bytes or words in UTF-8/16?
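
A minimal sketch of what I mean, assuming FPC 3.x where Pos() has overloads for both RawByteString and UnicodeString: the "character" to find is simply passed as a substring, so the same call covers one code unit or several. The variable names are only for illustration, and the strings are built from explicit code units to stay independent of the source encoding.

```pascal
program PosSketch;
{$mode objfpc}{$H+}

var
  Hay8, Needle8: RawByteString;
  Hay16, Needle16: UnicodeString;
begin
  // UTF-8: 'Gr' + U+00FC (bytes C3 BC) + 'n'
  Hay8 := 'Gr'#$C3#$BC'n';
  // the umlaut is searched for as a 2-byte substring
  Needle8 := #$C3#$BC;

  // UTF-16: U+1F600 as a surrogate pair, searched for as a 2-word substring
  Needle16 := '';
  Needle16 := Needle16 + WideChar($D83D);
  Needle16 := Needle16 + WideChar($DE00);

  // 'Gr' + U+00FC + 'n' + U+1F600
  Hay16 := 'Gr';
  Hay16 := Hay16 + WideChar($00FC);
  Hay16 := Hay16 + 'n';
  Hay16 := Hay16 + Needle16;

  // Pos() returns a code unit index (bytes or words), not a character count
  WriteLn(Pos(Needle8, Hay8));    // 3
  WriteLn(Pos(Needle16, Hay16));  // 5
end.
```

No special case is needed for "single character" searches; the returned index can be used directly for Copy() or further Pos() calls on the same string.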
DoDi