Re: [Lazarus] TStringList.LoadFromFile encoding parameter

Mattias Gaertner Mon, 11 Jul 2016 05:47:39 -0700

On Mon, 11 Jul 2016 10:57:57 +0100
Graeme Geldenhuys <[email protected]> wrote:

> On 2016-07-10 06:20, Martin Schreiber wrote:
> > We always can write that "UnicodeString" is the wrong name for a reference
> > counted utf-16 string because UTF8String or AnsiString with default code
> > page set to utf-8 also is Unicode in order to express our anger about the
> > bad marketing driven decision of the Delphi owners.
> 
> G*d, I so agree with that too! I simply hate the name "UnicodeString"
> implicitly implying UTF-16 only. "Unicode" is an algorithm with 3
> official encodings, not just UTF-16.

You know well that the name UnicodeString came from Delphi, where it
fits, because it is their only string supporting Unicode.
No one forces you to use this name in your code. You can define your own
alias type.
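
A minimal sketch of such an alias, assuming FPC 3.0 in objfpc mode. The
name TUtf16String is only an illustration, not an existing RTL type:

```pascal
program AliasSketch;
{$mode objfpc}{$H+}

type
  // Hypothetical alias: same type as UnicodeString, just a name that
  // states the encoding explicitly.
  TUtf16String = UnicodeString;

var
  S: TUtf16String;
  U: UnicodeString;
begin
  S := 'Hello';
  U := S;              // no conversion, both are the same type
  WriteLn(Length(U));  // length in UTF-16 code units
end.
```

Because this is a plain alias (no second "type" keyword), it stays
assignment compatible with UnicodeString everywhere, including var
parameters.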

> Then to boot, they introduced the AnsiString mess in FPC 3.0 - which now
> doesn't only mean ANSI encoding (contrary to what the name suggests), it
> now means Unicode encodings too. 

1. AnsiString comes from the Microsoft "ANSI" code pages, which were
never an ANSI standard at all, so the term "Ansi" was a misnomer from
the beginning.
2. MS accepted that and nowadays calls them just "code pages". But many
of their documentation pages still use the term "ANSI code page".
3. The Unicode consortium added UTF-8, which was specially designed for
legacy code using 8-bit strings.
4. Microsoft added the UTF-8 code page 65001 (and also code pages for
UTF-16 and UTF-32), but no MS Windows version used it as the system
code page.

FPC's AnsiString uses the MS code page numbers, which include UTF-8.

The new FPC 3.0 strings made it easier to use UTF-8 strings
- i.e. you need fewer conversions and more RTL functions support Unicode -
while still keeping compatibility.
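
To illustrate the code page numbers and the automatic conversions, here
is a rough sketch, assuming FPC 3.0 or later and a source file saved as
UTF-8. The type name TCp1252String is made up for the example; on
Unix-like systems a widestring manager such as the cwstring unit is
assumed to be needed for the conversion of non-ASCII characters:

```pascal
program CodePageSketch;
{$mode objfpc}{$H+}
{$codepage utf8}   // assumption: this source file is stored as UTF-8

{$ifdef unix}
uses
  cwstring;        // installs a widestring manager for conversions
{$endif}

type
  // Hypothetical name; 1252 is the MS code page number for
  // Western European (Windows Latin 1).
  TCp1252String = type AnsiString(1252);

var
  U: UTF8String;   // declared in the RTL as AnsiString(CP_UTF8)
  W: TCp1252String;
begin
  U := 'Grüße';
  W := U;          // the compiler inserts the UTF-8 -> CP1252 conversion

  // StringCodePage reports the code page stored with the string data.
  WriteLn('Code page of U: ', StringCodePage(U));
  WriteLn('Code page of W: ', StringCodePage(W));

  // Length counts bytes for AnsiString types, so the two can differ.
  WriteLn('Bytes in U: ', Length(U), ', bytes in W: ', Length(W));
end.
```

The point is that both variables are AnsiStrings; only the declared
code page differs, and the assignment between them needs no explicit
conversion call.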

>[...]

Mattias
-- 
_______________________________________________
Lazarus mailing list
[email protected]
http://lists.lazarus-ide.org/listinfo/lazarus
