Marco van de Voort wrote:
In our previous episode, Hans-Peter Diettrich said:
UTF-8/16 -> ANSI conversions are a bit more involved (since they map many
chars to few, a naive implementation requires large lookup sets).
A single 256-element array can be used for both directions. For ANSI to
Unicode, the char value indexes the array of Unicode values; for the
reverse direction, the given Unicode value is searched for in the array.
That is also an option, of course, but it is O(n).
I'm not sure whether this is a valid argument here. A constant n=256 is
equivalent to O(1) - it may be a single machine instruction. Effectively
the array size is only 128, because ASCII maps 1:1 to Unicode.
P.S.: The above applies to SBCS only; MBCS require more complex solutions.
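
A minimal sketch of the single-table idea, in Python rather than Pascal for brevity. It assumes ISO-8859-15 as the example SBCS code page and borrows the table contents from Python's own codec; the names TABLE, ansi_to_unicode and unicode_to_ansi are made up for illustration and are not FPC RTL routines. The reverse direction only searches the upper 128 entries, since ASCII maps 1:1.

```python
# One 256-entry table: index = byte value, entry = Unicode code point.
# Assumption: ISO-8859-15 stands in for "the ANSI code page" here.
TABLE = [ord(ch) for ch in bytes(range(256)).decode("iso8859_15")]


def ansi_to_unicode(data: bytes) -> str:
    """Forward direction: the byte value indexes the table directly."""
    return "".join(chr(TABLE[b]) for b in data)


def unicode_to_ansi(text: str, fallback: int = ord("?")) -> bytes:
    """Reverse direction: search the same table for the code point."""
    out = bytearray()
    for ch in text:
        cp = ord(ch)
        if cp < 0x80:
            out.append(cp)  # ASCII maps 1:1, no search needed
            continue
        try:
            # Linear search over the upper 128 entries only.
            out.append(TABLE.index(cp, 0x80))
        except ValueError:
            out.append(fallback)  # no equivalent in this code page
    return bytes(out)
```

The search is bounded by the constant table size, which is the point of the O(1) argument above; whether it beats a range-based lookup in practice would have to be measured.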
Probably the better solution is
what was mentioned before: have a set of ranges and smaller lookup tables for
these ranges.
This lowers the set size at the expense of a few (constant time)
comparisons.
See above :-)
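
For comparison, a rough sketch of the range-based alternative, again in Python and again assuming ISO-8859-15 as the example code page. The helper names (build_ranges, unicode_to_byte) and the max_gap threshold are hypothetical choices for illustration, not something proposed in the thread. Mapped code points are grouped into ranges, each with a small table indexed by offset, and a binary search picks the range.

```python
from bisect import bisect_right

# Forward table, assumed to be ISO-8859-15 (taken from Python's codec).
TABLE = [ord(ch) for ch in bytes(range(256)).decode("iso8859_15")]


def build_ranges(table, max_gap=16):
    """Group mapped code points into ranges with small per-range tables.

    Assumes every code point occurs only once in the table.
    Returns a list of (first_code_point, [byte or None, ...]).
    """
    ranges = []
    for cp, byte in sorted((cp, b) for b, cp in enumerate(table)):
        if ranges:
            first, sub = ranges[-1]
            last = first + len(sub) - 1
            if cp - last <= max_gap:
                # Pad holes with None, then store the byte at its offset.
                sub.extend([None] * (cp - first - len(sub)))
                sub.append(byte)
                continue
        ranges.append((cp, [byte]))
    return ranges


RANGES = build_ranges(TABLE)
STARTS = [first for first, _ in RANGES]


def unicode_to_byte(cp: int):
    """Return the byte for a code point, or None if it is unmapped."""
    i = bisect_right(STARTS, cp) - 1  # last range starting at or below cp
    if i < 0:
        return None
    first, sub = RANGES[i]
    offset = cp - first
    return sub[offset] if offset < len(sub) else None
```

This trades the linear search for a few comparisons plus one indexed load, at the cost of a second data structure that has to be kept consistent with the forward table.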
In any case, a single lookup table for both directions reduces memory
requirements and error-prone implementation effort.
DoDi
_______________________________________________
fpc-devel maillist - fpc-devel@lists.freepascal.org
http://lists.freepascal.org/mailman/listinfo/fpc-devel