Re: [Lazarus] dynamic string proposal

Juha Manninen via Lazarus Wed, 16 Aug 2017 08:56:26 -0700

On Wed, Aug 16, 2017 at 6:24 PM, Martin Frb via Lazarus
<lazarus@lists.lazarus-ide.org> wrote:
> Actually no.


I know the Unicode Standard does not officially call a CodeUnit or a
CodePoint a "character".
In normal communication, however, both are called "character".
For example, in the "String vs WideString" thread most people used
"character" as a synonym for CodePoint.
For CodeUnit the term is logical for historical reasons, because the
type "Char" is short for "Character". This meaning matters because
CodeUnit resolution remains useful even with variable-width
encodings.
For example, the following code works perfectly with both UTF-8 and UTF-16:

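function SplitInHalf(const Txt, Separator: string; out Half1, Half2: string): Boolean;
var
  i: Integer;
begin
  // Pos returns the 1-based CodeUnit index of the first match, or 0 if none.
  i := Pos(Separator, Txt);
  Result := i > 0;
  if Result then
  begin
    // Everything before the separator.
    Half1 := Copy(Txt, 1, i-1);
    // Everything after it; Copy clamps a Count that runs past the end.
    Half2 := Copy(Txt, i+Length(Separator), Length(Txt));
  end;
end;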

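even though Pos(), Copy() and Length() all work at CodeUnit resolution.
It works because, in well-formed UTF-8 and UTF-16, a match of a complete
Separator always begins and ends on a CodePoint boundary.
I wonder how the new fancy string types would handle it without a
performance penalty.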
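
To make the point concrete, here is a minimal sketch of the same
function run on UTF-8 text with a multi-byte separator. It assumes
Free Pascal with {$mode objfpc}{$H+}, so that "string" is a byte-based
AnsiString. The program name, the Arrow constant and the sample text
are only illustrative. The UTF-8 bytes are written as #$.. escapes so
the result does not depend on the source file encoding.

```pascal
program SplitDemo;
{$mode objfpc}{$H+}

function SplitInHalf(const Txt, Separator: string; out Half1, Half2: string): Boolean;
var
  i: Integer;
begin
  i := Pos(Separator, Txt);
  Result := i > 0;
  if Result then
  begin
    Half1 := Copy(Txt, 1, i-1);
    Half2 := Copy(Txt, i+Length(Separator), Length(Txt));
  end;
end;

const
  // U+2192 RIGHTWARDS ARROW, three CodeUnits in UTF-8.
  Arrow = #$E2#$86#$92;
var
  A, B: string;
begin
  // Input is "p", U+00E4, U+00E4, arrow, "y", U+00F6, all encoded as UTF-8.
  if not SplitInHalf('p'#$C3#$A4#$C3#$A4 + Arrow + 'y'#$C3#$B6, Arrow, A, B) then
    Halt(1);
  // Both halves must come back as complete, unbroken UTF-8 sequences.
  if A <> 'p'#$C3#$A4#$C3#$A4 then
    Halt(2);
  if B <> 'y'#$C3#$B6 then
    Halt(3);
  // Length counts CodeUnits (bytes here), not CodePoints: prints "5 3".
  WriteLn(Length(A), ' ', Length(B));
end.
```

Both halves have three CodePoints or fewer, yet Length reports 5 and 3.
The split is still correct because the indexes never leave CodeUnit
resolution and never need converting to CodePoints.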

Juha
