Re: [fpc-devel] Unicode in the RTL (my ideas)

Martin Schreiber Tue, 21 Aug 2012 02:16:06 -0700

On 21.08.2012 09:55, Graeme Geldenhuys wrote:

On 21 August 2012 07:10, Ivanko B <ivankob4m...@gmail.com> wrote:

How about supporting in the RTL all versions of UCS-2 & UTF-16 (for
fast per-char access and similar optimizations) and UTF-8 (for an
unlimited number of alphabets)?


Of all the "access a char by index into a string" code I have seen,
99.99% works in a sequential manner. For that reason there is no
speed difference between using a UTF-16 or a UTF-8 encoded string.
Both can be coded equally efficiently.

Graeme, this is simply not true. When searching for known German
characters in a UnicodeString, the program can use the simple approach
of indexing by character (code unit). This is even possible for known
Chinese symbols of the BMP. And a single "if" for surrogate pairs is
more efficient than a 4-stage "case" for UTF-8.
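
To make the comparison concrete, here is a rough sketch in Python (not
RTL code; all function names are made up for illustration, and
well-formed input is assumed). It shows the single surrogate check
needed to step through UTF-16 code units, the four-way branch needed
for UTF-8, and a search for a known BMP character by plain code unit
comparison.

```python
def utf16_units(s):
    """Return the UTF-16 code units of s as a list of integers."""
    raw = s.encode("utf-16-le")
    return [raw[k] | (raw[k + 1] << 8) for k in range(0, len(raw), 2)]


def next_utf16(units, i):
    """Decode one code point at index i; return (code_point, next_index)."""
    u = units[i]
    if 0xD800 <= u <= 0xDBFF:  # the single "if": high surrogate found
        low = units[i + 1]
        return 0x10000 + ((u - 0xD800) << 10) + (low - 0xDC00), i + 2
    return u, i + 1


def next_utf8(data, i):
    """Decode one code point at byte index i; return (code_point, next_index)."""
    b = data[i]
    if b < 0x80:   # stage 1: one byte (ASCII)
        return b, i + 1
    if b < 0xE0:   # stage 2: two bytes
        return ((b & 0x1F) << 6) | (data[i + 1] & 0x3F), i + 2
    if b < 0xF0:   # stage 3: three bytes
        return (((b & 0x0F) << 12) | ((data[i + 1] & 0x3F) << 6)
                | (data[i + 2] & 0x3F)), i + 3
    # stage 4: four bytes
    return (((b & 0x07) << 18) | ((data[i + 1] & 0x3F) << 12)
            | ((data[i + 2] & 0x3F) << 6) | (data[i + 3] & 0x3F)), i + 4


def find_bmp_char(units, ch):
    """Return the code unit index of BMP character ch, or -1 if absent.

    A non-surrogate BMP value never occurs inside a surrogate pair, so a
    plain code unit comparison is enough; no decoding is needed.
    """
    target = ord(ch)
    for i, u in enumerate(units):
        if u == target:
            return i
    return -1
```

For example, with the units of "Größe", find_bmp_char returns the
position of "ß" directly as a code unit index, which can then be used
for indexed access into the string.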


Martin
_______________________________________________
fpc-devel maillist  -  fpc-devel@lists.freepascal.org
http://lists.freepascal.org/mailman/listinfo/fpc-devel
