On 26.12.2019 19:29, Michael Van Canneyt wrote:
> So no, I don't think these need to be changed/merged. What IMO can be
> discussed is which of these 2 needs to be used as the default codepage in
> other code. It should then resolve the problems that appear, I think.

That would be possible as well, but please reconsider, for two reasons.

First, by convention the default codepage to use should be TEncoding.Default. That is intuitive.

Second, we now have TEncoding.ANSI = TEncoding.Default, i.e. two properties that are always equal, plus a third, FPC-only property, TEncoding.SystemEncoding. That makes 3 properties for 2 values.
---

In Delphi TEncoding.ANSI and TEncoding.Default are actually different. See:
http://docwiki.embarcadero.com/Libraries/Rio/en/System.SysUtils.TEncoding.Default
http://docwiki.embarcadero.com/Libraries/Rio/en/System.SysUtils.TEncoding.ANSI

On Windows they are equal, but on POSIX they differ: TEncoding.Default is UTF-8, while TEncoding.ANSI is the code page from CFLocaleGetIdentifier.

Read the .NET docs about Encoding.Default:
https://docs.microsoft.com/en-us/dotnet/api/system.text.encoding.default?redirectedfrom=MSDN&view=netframework-4.8#System_Text_Encoding_Default
On .NET Framework it is ANSI, but on .NET Core it is UTF-8, even on Windows.

With all the information from the docs, I am more and more convinced that TEncoding.SystemEncoding is superfluous and TEncoding.Default should take over its meaning: TEncoding.Default should reflect changes in DefaultSystemCodePage, whereas TEncoding.ANSI should stay a fixed ANSI code page. Then there is no need for TEncoding.SystemEncoding.

With this change, in the current Lazarus UTF-8 solution, TEncoding.Default will be UTF-8. In a future Unicode, Delphi-compatible FPC/Lazarus, TEncoding.Default will get the Delphi meaning (ANSI on Windows, UTF-8 on POSIX). IMO the concept is very sensible.
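
To check how these properties relate on a given installation, a small test program along the following lines may help. This is only a sketch: it assumes an FPC version that already ships TEncoding.SystemEncoding, the program and procedure names are made up for illustration, and no output is quoted because the values depend on platform, locale and FPC version.

```pascal
program EncodingProps;
{$mode objfpc}{$H+}

uses
  SysUtils;

// Illustrative helper (not part of the RTL): prints the code page behind
// each TEncoding property next to DefaultSystemCodePage.
procedure Dump(const ALabel: string);
begin
  WriteLn(ALabel);
  WriteLn('  DefaultSystemCodePage:    ', DefaultSystemCodePage);
  WriteLn('  TEncoding.Default:        ', TEncoding.Default.CodePage);
  WriteLn('  TEncoding.ANSI:           ', TEncoding.ANSI.CodePage);
  WriteLn('  TEncoding.SystemEncoding: ', TEncoding.SystemEncoding.CodePage);
end;

begin
  Dump('Initial state');
  // Roughly what the Lazarus UTF-8 setup does: switch the default system
  // code page to UTF-8 at run time.
  SetMultiByteConversionCodePage(CP_UTF8);
  Dump('After SetMultiByteConversionCodePage(CP_UTF8)');
end.
```

If TEncoding.Default behaved as proposed, its code page would follow DefaultSystemCodePage in the second dump while TEncoding.ANSI stayed unchanged.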

---

Btw. you have a bug in:

constructor TStringStream.CreateRaw(const AString: RawByteString);
var
  CP: TSystemCodePage;
begin
  CP:=StringCodePage(AString);
  if (CP=CP_ACP) or (CP=TEncoding.Default.CodePage) then // this line is wrong
    begin
    FEncoding:=TEncoding.Default;
    FOwnsEncoding:=False;
    end
  else

In the code above, TEncoding.Default is used if CP=CP_ACP. That is currently wrong, and the bug perfectly reflects my suggestion for the TEncoding.Default change. Currently, CP_ACP corresponds to DefaultSystemCodePage and thus to TEncoding.SystemEncoding, not to TEncoding.Default. TEncoding.Default corresponds to ANSI, which is not the same as CP_ACP, as documented at https://wiki.freepascal.org/FPC_Unicode_support.

The code should be:
if (CP=CP_ACP) or (CP=TEncoding.SystemEncoding.CodePage) then
begin
  FEncoding:=TEncoding.SystemEncoding;
  FOwnsEncoding:=False;
end else
if (CP=TEncoding.Default.CodePage) then
begin
  FEncoding:=TEncoding.Default;
  FOwnsEncoding:=False;
end else
// ...

The current CreateRaw code would be correct under my suggestion. As you can see, you intuitively expected the approach I am suggesting :)
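
The branch order can also be tried outside of TStringStream. The following is a minimal sketch under the current semantics; PickSharedEncoding is a hypothetical helper, not an RTL function, and it returns nil for every other code page because that branch is elided in the snippet above.

```pascal
program PickEncodingDemo;
{$mode objfpc}{$H+}

uses
  SysUtils;

// Hypothetical helper: maps a string code page to one of the shared
// TEncoding instances, checking SystemEncoding before Default.
// Returns nil for any other code page (that branch is not shown in the mail).
function PickSharedEncoding(CP: TSystemCodePage): TEncoding;
begin
  if (CP = CP_ACP) or (CP = TEncoding.SystemEncoding.CodePage) then
    Result := TEncoding.SystemEncoding
  else if CP = TEncoding.Default.CodePage then
    Result := TEncoding.Default
  else
    Result := nil;
end;

var
  S: RawByteString;
  E: TEncoding;
begin
  S := 'abc';
  // Retag the string as UTF-8 without converting its bytes.
  SetCodePage(S, CP_UTF8, False);
  E := PickSharedEncoding(StringCodePage(S));
  if E = nil then
    WriteLn('No shared encoding for code page ', StringCodePage(S))
  else
    WriteLn('Picked shared encoding with code page ', E.CodePage);
end.
```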

Ondrej

_______________________________________________
fpc-devel maillist  -  fpc-devel@lists.freepascal.org
https://lists.freepascal.org/cgi-bin/mailman/listinfo/fpc-devel
