Marco van de Voort wrote:

> Yes, but the realisation should be that the insistence on array indexing
> is what makes it expensive. The problem could be strongly reduced by
> removing such array-indexing scaffolding from all routines where it is
> not necessary.

Why fall from one extreme into the other? Traditional for loops have their use with array structures; iterators have their use with other data structures.
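To make the cost difference concrete: a minimal Python sketch (not FPC code, and the helper names `char_at`/`iter_chars` are invented for illustration) of why indexing a UTF-8 buffer by character position is expensive, while sequential iteration over the same buffer is cheap:

```python
def char_at(buf: bytes, index: int) -> str:
    """O(n) per call: must scan from the start to find the index-th
    character, because UTF-8 characters vary in width."""
    count = 0
    for i, b in enumerate(buf):
        # Bytes of the form 0b10xxxxxx are continuation bytes, not starts.
        if b & 0xC0 != 0x80:
            if count == index:
                # Find the end of this character and decode just it.
                j = i + 1
                while j < len(buf) and buf[j] & 0xC0 == 0x80:
                    j += 1
                return buf[i:j].decode("utf-8")
            count += 1
    raise IndexError(index)

def iter_chars(buf: bytes):
    """O(n) total: an iterator walks the buffer exactly once."""
    start = None
    for i, b in enumerate(buf):
        if b & 0xC0 != 0x80:          # lead byte: previous char is done
            if start is not None:
                yield buf[start:i].decode("utf-8")
            start = i
    if start is not None:
        yield buf[start:].decode("utf-8")

s = "naïve…".encode("utf-8")
# Indexing every position re-scans the buffer each time; the iterator
# produces the same characters in a single pass.
assert [char_at(s, i) for i in range(6)] == list(iter_chars(s))
```

Indexing all n characters through `char_at` is O(n²) in total, while `iter_chars` stays O(n), which is exactly the argument for iterator-style loops over variable-width encodings.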


>> UTF encodings have their primary use in data storage and in exchange
>> with external APIs.

> And in memory.

That's the design flaw, IMO. When UTF-8 strings are re-encoded before processing, even insertion and deletion require only linear time. All of that could be encapsulated in a class with a flexible internal string representation.
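Such a class might look like this minimal Python sketch (not a proposed FPC API; the class and method names are invented): UTF-8 at the boundaries, a fixed-width array of code points inside, so indexing is O(1) and insertion/deletion is a plain O(n) array operation with no byte-level rescanning:

```python
class FlexString:
    """Sketch: UTF-8 at the boundaries, fixed-width code points inside."""

    def __init__(self, utf8: bytes):
        # One O(n) decode up front buys O(1) indexing afterwards.
        self._cps = [ord(c) for c in utf8.decode("utf-8")]

    def __getitem__(self, i: int) -> str:
        # O(1), unlike scanning a UTF-8 buffer for the i-th character.
        return chr(self._cps[i])

    def insert(self, i: int, s: str) -> None:
        # O(n) array shift by character position, no byte arithmetic.
        self._cps[i:i] = [ord(c) for c in s]

    def delete(self, i: int, count: int = 1) -> None:
        del self._cps[i:i + count]

    def to_utf8(self) -> bytes:
        # Re-encode only when the string leaves the class.
        return "".join(map(chr, self._cps)).encode("utf-8")

s = FlexString("naïve".encode("utf-8"))
s.delete(2)        # remove the 'ï' by character position, not byte offset
s.insert(2, "i")
assert s.to_utf8() == b"naive"
```

A real implementation could also switch the internal representation (Latin-1, UCS-2, UCS-4) based on the widest code point present, which is essentially what CPython's flexible string representation (PEP 393) does.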


>> Furthermore I think that Unicode string handling should not be based on
>> single characters at all, but should instead use (sub)strings
>> throughout, covering multi-byte character representations, ligatures,
>> etc. as well.

> That is dog slow. You can make such a library for special purposes, but
> for most day-to-day use it is overkill.

I don't think so, and you don't either:

> The most common string operations that the average programmer does are
> searching for substrings and then splitting on them, something that can
> be done perfectly well in UTF-8.

:-)
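Marco's point rests on UTF-8 being self-synchronizing: no character's encoding appears as a substring of another character's encoding, so a byte-level search for a valid UTF-8 needle can never match in the middle of a multi-byte character. A quick Python illustration, operating on raw bytes the way a plain memory search would:

```python
text = "Grüße – Grüße".encode("utf-8")
sep = " – ".encode("utf-8")   # U+2013 EN DASH is 3 bytes in UTF-8

# Plain byte-level search and split are safe on UTF-8: continuation
# bytes (0b10xxxxxx) can never be mistaken for the lead byte of
# another character, so no false matches inside 'ü' or 'ß' occur.
pos = text.find(sep)
parts = text.split(sep)

assert pos == 7               # "Grüße" is 7 bytes: G r ü(2) ß(2) e
assert parts == ["Grüße".encode("utf-8")] * 2
assert all(p.decode("utf-8") == "Grüße" for p in parts)
```

This is why substring search and split need no re-encoding at all, whereas anything phrased in terms of the i-th character does.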

DoDi

_______________________________________________
fpc-devel maillist  -  fpc-devel@lists.freepascal.org
http://lists.freepascal.org/mailman/listinfo/fpc-devel
