Marco van de Voort wrote:
> First you would have to come up with a workable model for s[x] being utf32chars in general that doesn't suffer from O(N^2) performance degradation (read/write)
Right, UTF-32 or UCS-2 would be much more useful for computations.
> And for it to be useful, it must be workable for more than only the most basic loops, but also for e.g. if length(s)>1 then for i:=1 to length(s)-1 do s[i]:=s[i+1]; and similar loops that might read/write more than once, and use calculated expressions as the parameter to []
Okay, essentially you have outlined why UTF is not a useful encoding for computation at all. The above loop body results in O(N^2) behaviour for every kind of loop, be it counted or iterated, on data structures with non-uniformly sized elements.
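To make that concrete, here is a minimal sketch (my own illustration, not existing RTL code) of what a code-point index into a UTF-8 string costs; Utf8SeqLen and CodePointOffset are hypothetical helpers and assume well-formed input:

program IndexCostDemo;
{$mode objfpc}{$H+}

{ Byte length of the UTF-8 sequence introduced by lead byte Lead
  (assumes well-formed UTF-8). }
function Utf8SeqLen(Lead: Byte): SizeInt;
begin
  if Lead < $80 then Result := 1
  else if Lead < $E0 then Result := 2
  else if Lead < $F0 then Result := 3
  else Result := 4;
end;

{ Byte position (1-based) of the Index-th code point in a UTF-8 encoded
  string: it has to walk from the start, so one call already costs O(Index). }
function CodePointOffset(const s: AnsiString; Index: SizeInt): SizeInt;
var
  i: SizeInt;
begin
  Result := 1;
  for i := 2 to Index do
    Inc(Result, Utf8SeqLen(Byte(s[Result])));
end;

var
  Sample: AnsiString;
begin
  Sample := 'ab'#$C3#$A4'cd';            { 'a','b', U+00E4 (two bytes), 'c','d' }
  WriteLn(CodePointOffset(Sample, 4));   { prints 5 }
end.

The quoted loop would need something like CodePointOffset for every read and every write, i.e. roughly 2*N scans of up to N characters, hence O(N^2); on top of that, an assignment can change the byte length of the target character and force the tail of the string to be moved.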
UTF encodings have their primary use in data storage and exchange with external APIs. A useful data type and implementation would use a fixed-width (SxCS) encoding internally, and would accept or supply UTF-8 or UTF-16 strings only when explicitly asked to. All serious UTF/MBCS implementations already come with iterators, and only uneducated people would ever try to apply their SBCS-based algorithms and habits to MBCS encodings. At least they would change their mind soon, after encountering the first bugs resulting from such an inappropriate approach.
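As a rough sketch of that split, assuming the standard FPC/Delphi RTL helpers UTF8Decode/UTF8Encode and UnicodeStringToUCS4String/UCS4StringToUnicodeString (and the usual convention that UCS4String carries a trailing zero element):

program Ucs4Demo;
{$mode objfpc}{$H+}

var
  u8: UTF8String;
  s4: UCS4String;
  i:  SizeInt;
begin
  u8 := 'some UTF-8 input';
  { Convert once, at the boundary; the result is a dynamic array of
    UCS4Char, so every element access below is O(1). }
  s4 := UnicodeStringToUCS4String(UTF8Decode(u8));

  { The quoted shift loop, now a plain O(N) pass; Length(s4)-1 is the
    number of code points, the last element being the terminating zero. }
  if Length(s4) > 2 then
    for i := 0 to Length(s4) - 3 do
      s4[i] := s4[i + 1];

  { Convert back only when a UTF-8 string is explicitly asked for. }
  u8 := UTF8Encode(UCS4StringToUnicodeString(s4));
  WriteLn(u8);
end.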
BTW, I just found a similarly inappropriate handling of digraphs in the scanner, where checks for such character combinations occur in many places, with no guarantee that all cases are really covered.
Furthermore, I think that detailed Unicode string handling should not be based on single characters at all, but should instead use (sub)strings throughout, covering multi-byte character representations, ligatures etc. as well. Then the basic operations would be insertion and deletion of substrings, in addition to substring extraction and concatenation.
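Purely as an illustration of that idea, a hedged sketch of such a substring-based surface (TTextPos, ITextBuffer and all method names are hypothetical, not an existing FPC API):

unit TextBufSketch;
{$mode objfpc}{$H+}

interface

type
  { Opaque position inside the text; here simply a byte offset. }
  TTextPos = SizeInt;

  { Substring-centric editing surface: callers never index single
    characters; positions come from searching, not from counting. }
  ITextBuffer = interface
    function Find(const Pattern: UTF8String; From: TTextPos): TTextPos;
    function Extract(FromPos, ToPos: TTextPos): UTF8String;
    procedure InsertAt(Pos: TTextPos; const Fragment: UTF8String);
    procedure DeleteRange(FromPos, ToPos: TTextPos);
    procedure Append(const Fragment: UTF8String);
  end;

implementation

end.

Such an interface can also take care of keeping the buffer on code-point (or grapheme) boundaries internally, so combining sequences and ligatures never get split by callers.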
DoDi