On Wed, Aug 16, 2017 at 6:24 PM, Martin Frb via Lazarus <lazarus@lists.lazarus-ide.org> wrote:
> Actually no.

I know CodeUnit and CodePoint are not officially called "character" by the Unicode Standard. They are, however, called "character" in normal communication. For example, in the "String vs WideString" thread most people used "character" as a synonym for CodePoint.

For CodeUnit the term is very logical for historical reasons, as the type "Char" is a short form of "Character". This is a very important meaning because CodeUnit resolution is so useful, even with variable-width encodings. For example, the following code works perfectly with UTF-8 and UTF-16:

  function SplitInHalf(Txt, Separator: string; out Half1, Half2: string): Boolean;
  var
    i: Integer;
  begin
    // Pos() returns a 1-based CodeUnit index, or 0 if Separator is not found
    i := Pos(Separator, Txt);
    Result := i > 0;
    if Result then
    begin
      // Everything before the separator
      Half1 := Copy(Txt, 1, i-1);
      // Everything after it; Copy() clamps the count to the end of Txt
      Half2 := Copy(Txt, i+Length(Separator), Length(Txt));
    end;
  end;

although Pos(), Copy() and Length() deal with CodeUnit resolution.

I wonder how the new fancy string types would handle it without a performance penalty.

Juha
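
To make the claim concrete, here is a minimal sketch of SplitInHalf applied to UTF-8 text in which both the halves and the separator contain multi-byte characters. It relies on UTF-8 being self-synchronizing: a valid encoded sequence can only match at a character boundary, so a CodeUnit index from Pos() never lands in the middle of a CodePoint. The program name, the constants Cafe, The and Arrow, and the "const" on the input parameters are assumptions added for illustration; they are not from the original post. The non-ASCII bytes are written as escapes so the result should not depend on the source file encoding.

```pascal
program SplitDemo;
{$mode objfpc}{$H+}

// SplitInHalf as in the post, with "const" added on the inputs (an assumption).
function SplitInHalf(const Txt, Separator: string; out Half1, Half2: string): Boolean;
var
  i: Integer;
begin
  i := Pos(Separator, Txt);          // 1-based CodeUnit index, 0 if not found
  Result := i > 0;
  if Result then
  begin
    Half1 := Copy(Txt, 1, i - 1);
    Half2 := Copy(Txt, i + Length(Separator), Length(Txt));
  end;
end;

const
  // Hypothetical sample data, spelled out as raw UTF-8 bytes.
  Cafe  = 'caf'#$C3#$A9;             // "cafe" with e-acute: 5 CodeUnits, 4 CodePoints
  The   = 'th'#$C3#$A9;              // "the" with e-acute: 4 CodeUnits, 3 CodePoints
  Arrow = ' '#$E2#$86#$92' ';        // space, U+2192, space: 5 CodeUnits

var
  A, B: string;
begin
  if SplitInHalf(Cafe + Arrow + The, Arrow, A, B) then
    // Lengths are in CodeUnits; the comparison checks both halves byte for byte.
    WriteLn(Length(A), ' ', Length(B), ' ', (A = Cafe) and (B = The))
  else
    WriteLn('separator not found');
end.
```

No decoding to CodePoints happens anywhere in this sketch, which is the point of the argument: the split stays correct while Pos(), Copy() and Length() remain plain CodeUnit operations.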