Mattias Gaertner wrote:

Yes. For Unicode encodings we would need new functions to distinguish between the number of bytes and the number of (visible) glyphs:

LengthInBytes()
LengthInGlyphs()
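
To illustrate the difference between the two counts, here is a minimal sketch in Python (chosen only because its standard library makes this easy to try out). The function names are hypothetical and simply mirror the two proposed above; they are not existing RTL or LCL functions. The "glyph" count here is only an approximation that skips combining marks; it does not handle ligatures or other shaping effects.

```python
import unicodedata


def length_in_bytes(s: str, encoding: str = "utf-8") -> int:
    """Number of bytes the string occupies in the given encoding."""
    return len(s.encode(encoding))


def length_in_codepoints(s: str) -> int:
    """Number of Unicode code points (not the same as visible glyphs)."""
    return len(s)


def length_in_glyphs_approx(s: str) -> int:
    """Rough glyph count: code points that are not combining marks.

    Assumption: one base character plus its combining marks is one glyph.
    Ligatures and other font-level shaping are ignored.
    """
    return sum(1 for ch in s if unicodedata.combining(ch) == 0)


composed = "\u00e9"      # 'é' as a single precomposed code point
decomposed = "e\u0301"   # 'e' followed by a combining acute accent

for text in (composed, decomposed):
    print(length_in_bytes(text),
          length_in_codepoints(text),
          length_in_glyphs_approx(text))
```

Both strings render as one character, yet they differ in byte count and in code point count.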

It should be mentioned that Unicode allows different encodings for composed and decomposed characters. E.g. 'é' can be stored as a single precomposed code point (U+00E9) or as two code points, 'e' followed by a combining acute accent (U+0301). Even though both encodings look the same on screen, Pos (or UTF8Pos) will only find the encoding given in the search string. It also has to be specified what LengthInGlyphs should really return: the number of actually visible glyphs? And what about ligatures etc.?
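
The search problem can be sketched as follows, again in Python for illustration only. `find_normalized` is a hypothetical helper, not an existing function; it assumes that normalizing both strings to the same form (NFC here) before searching is acceptable for the caller. Note that the returned index refers to the normalized text, not to the original string.

```python
import unicodedata

composed = "caf\u00e9"      # ends in precomposed 'é' (U+00E9)
decomposed = "cafe\u0301"   # ends in 'e' + combining acute (U+0301)


def find_normalized(haystack: str, needle: str, form: str = "NFC") -> int:
    """Search after bringing both strings into the same normalization form.

    Returns the index within the *normalized* haystack, or -1 if not found.
    """
    h = unicodedata.normalize(form, haystack)
    n = unicodedata.normalize(form, needle)
    return h.find(n)


# A plain search only matches the exact code point sequence given.
print(decomposed.find("\u00e9"))

# After normalization the two spellings compare as equal.
print(find_normalized(decomposed, "\u00e9"))
```

Normalization makes the two spellings of 'é' comparable, but it does not answer the ligature question: whether a ligature counts as one glyph or several still has to be decided separately.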

Every user has to know which kind of "length" is really wanted:
- number of bytes for storage in a fixed-size variable or streaming
- number of glyphs for length-restricted user input
- number of pixels for GUI layout (TextWidth)
...

DoDi


--
_______________________________________________
Lazarus mailing list
Lazarus@lists.lazarus.freepascal.org
http://lists.lazarus.freepascal.org/mailman/listinfo/lazarus
