Marco van de Voort wrote:
> Yes, but the realisation should be that holding on to array indexing
> is what makes it expensive. The problem could be strongly reduced by
> removing the array-indexing idiom from all routines where it is not
> necessary.

Why fall from one extreme to the other? Traditional For loops have
their use with array structures; iterators have their use with other
data structures.
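
Just to make the cost explicit, a minimal sketch (the function name is
mine, not an RTL routine): finding the N-th code point of a UTF-8
string means scanning every byte in front of it, so a
character-indexed For loop over the whole string degenerates into
quadratic time.

program Utf8IndexCost;
{$mode objfpc}{$H+}

{ Return the 1-based byte index where code point CP (1-based) starts.
  Every lookup has to scan from the front; doing this once per
  character inside a loop turns O(n) work into O(n^2). }
function CodePointToByteIndex(const S: UTF8String; CP: SizeInt): SizeInt;
var
  B: SizeInt;
begin
  B := 1;
  while (CP > 1) and (B <= Length(S)) do
  begin
    Inc(B);                                 { step over the lead byte }
    while (B <= Length(S)) and ((Ord(S[B]) and $C0) = $80) do
      Inc(B);                               { ... and its continuation bytes }
    Dec(CP);
  end;
  Result := B;
end;

var
  S: UTF8String;
begin
  S := 'a' + #$C3#$A4 + 'b' + #$E2#$82#$AC + 'c';  { 'a ä b € c' as UTF-8 }
  WriteLn(CodePointToByteIndex(S, 4));  { 5: the Euro sign starts at byte 5 }
end.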

>> UTF encodings have their primary use in data storage and in
>> exchange with external APIs.
>
> And in memory.

That's the design flaw, IMO. When UTF-8 strings are re-encoded into a
fixed-width representation before processing, indexing becomes O(1),
and even insertion and deletion require only linear time. All of that
could be encapsulated in a class with a flexible internal string
representation.
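
A rough sketch of what I mean (all names are illustrative only, and I
assume the RTL's UCS4String helpers with their trailing #0
convention):

program FlexStringDemo;
{$mode objfpc}{$H+}

type
  { Wrapper that re-encodes UTF-8 into fixed-width UCS-4 once, works
    on that, and only re-encodes on the way out. No bounds checks; it
    is a sketch, not a finished class. }
  TFlexString = class
  private
    FData: UCS4String;      { one element per code point, no terminator }
  public
    constructor Create(const Utf8: UTF8String);
    function CharAt(Index: SizeInt): UCS4Char;
    procedure DeleteChar(Index: SizeInt);
    function ToUtf8: UTF8String;
  end;

constructor TFlexString.Create(const Utf8: UTF8String);
begin
  inherited Create;
  FData := UnicodeStringToUCS4String(UTF8Decode(Utf8));
  SetLength(FData, Length(FData) - 1);   { drop the trailing #0 element }
end;

function TFlexString.CharAt(Index: SizeInt): UCS4Char;
begin
  Result := FData[Index];                { O(1), no scanning }
end;

procedure TFlexString.DeleteChar(Index: SizeInt);
var
  I: SizeInt;
begin
  for I := Index to High(FData) - 1 do   { one linear shift, O(n) }
    FData[I] := FData[I + 1];
  SetLength(FData, Length(FData) - 1);
end;

function TFlexString.ToUtf8: UTF8String;
var
  Tmp: UCS4String;
begin
  Tmp := Copy(FData, 0, Length(FData));
  SetLength(Tmp, Length(Tmp) + 1);
  Tmp[High(Tmp)] := 0;            { restore the terminator the RTL expects }
  Result := UTF8Encode(UCS4StringToUnicodeString(Tmp));
end;

var
  F: TFlexString;
begin
  F := TFlexString.Create('a' + #$C3#$A4 + 'b');  { 'aäb' as UTF-8 bytes }
  F.DeleteChar(1);                                { remove the 'ä' }
  WriteLn(F.ToUtf8);                              { prints 'ab' }
  F.Free;
end.

Whether the internal representation is UCS-4, UTF-16, or something
adaptive would then be invisible to the user of the class.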

>> Furthermore I think that Unicode string handling should not be
>> based on single characters at all, but should instead use
>> (sub)strings throughout, covering multi-byte character
>> representations, ligatures etc. as well.
>
> This is dog slow. You can make such a library for special purposes,
> but for most day-to-day use it is overkill.

I don't think so, and you don't either:

> The most common string operations that the average programmer does
> are searching for substrings and then splitting on them, something
> that can be done perfectly well in UTF-8.

:-)
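
For completeness, a small sketch of why that works: UTF-8 is
self-synchronizing, so a byte-wise Pos() match of a valid needle
always falls on a character boundary, and the resulting slices remain
valid UTF-8.

program Utf8Split;
{$mode objfpc}{$H+}

var
  S: UTF8String;
  P: SizeInt;
begin
  S := 'key' + #$C3#$A4 + '=value';   { 'keyä=value' as raw UTF-8 bytes }
  P := Pos('=', S);                   { plain byte-wise search: byte 6 }
  WriteLn(Copy(S, 1, P - 1));         { 'keyä', still valid UTF-8 }
  WriteLn(Copy(S, P + 1, MaxInt));    { 'value' }
end.
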
DoDi