Re: [fpc-devel] TEncoding.Default and default encoding for TStrings.LoadFrom*()

2020-01-03 Thread Ondrej Pokorny

On 03.01.2020 11:09, Michael Van Canneyt wrote:
I also think it is very hypothetical, and not a problem unless you want to use
the same stream in Delphi and FPC.

Well, you have my blessing for the soPreserveBOM :)


Added in r43848. I'll check how the FPC documentation works and try to 
add it there.
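
For readers following the thread, here is a minimal sketch of how the new option might be used. It assumes an FPC trunk build that includes r43848 and that soPreserveBOM is an element of TStrings.Options, as agreed above. The file names are hypothetical, and the behaviour in the comments is the one described in this thread, not something verified here.

```pascal
program PreserveBomSketch;

{$mode objfpc}{$H+}

uses
  Classes, SysUtils;

var
  SL: TStringList;
begin
  SL := TStringList.Create;
  try
    // Assumption: soPreserveBOM exists in this FPC version (trunk, r43848 or later).
    SL.Options := SL.Options + [soPreserveBOM];

    // 'input.txt' is a hypothetical file name.
    // As described in the thread, loading should set WriteBOM according to
    // whether the loaded file started with a BOM.
    SL.LoadFromFile('input.txt');

    SL.Add('appended line');

    // With the option set, the BOM state of the input should carry over here.
    SL.SaveToFile('output.txt');
  finally
    SL.Free;
  end;
end.
```

Without the option, WriteBOM keeps whatever value it had before loading, which is the situation Werner described.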


Ondrej

___
fpc-devel maillist  -  fpc-devel@lists.freepascal.org
https://lists.freepascal.org/cgi-bin/mailman/listinfo/fpc-devel


Re: [fpc-devel] TEncoding.Default and default encoding for TStrings.LoadFrom*()

2020-01-03 Thread Michael Van Canneyt



On Fri, 3 Jan 2020, Ondrej Pokorny wrote:


On 03.01.2020 10:14, Michael Van Canneyt wrote:

On Fri, 3 Jan 2020, Ondrej Pokorny wrote:


On 03.01.2020 00:35, Werner Pamler wrote:

Am 02.01.2020 um 20:10 schrieb Ondrej Pokorny:
TStrings.SaveTo*() writes BOM by default. Formerly the BOM was not 
written.


There is also the problem that currently it is not possible, without 
further action, to retain the BOM state of a file loaded into a 
stringlist, modified and written back because the presence of a BOM 
is forgotten after reading - see the other discussion
(https://lists.freepascal.org/pipermail/fpc-devel/2020-January/042363.html).

Wouldn't it make sense to introduce a (Read)BOM property (a boolean 
parameter or an element of the new Options) which gets its value 
when the file is loaded or the strings are assigned? Then the user 
can set the StringList.WriteBOM equal to the StringList.BOM when he 
wants to keep the BOM for writing back.


Yes, that is perfectly reasonable. I'd prefer a new element in 
Options but there is the risk that Delphi adds a new option in the 
future and then we'll have a problem. So maybe a "PreserveBOM" or 
"SetWriteBOMOnLoad" property will be better. When set to true, 
WriteBOM will be set in LoadFrom*() according to BOM presence of the 
loaded file.


I don't see why a new option is a problem ? They are not streamed anyway.

So I would do the opposite, add an option. soPreserveBOM.


If you are fine with it, me as well.

Yes, the problem is if somebody streams the property or uses 
Ord(soPreserveBOM) for something etc. I admit that it is a very 
hypothetical issue.


I also think it is very hypothetical, and not a problem unless you want to use
the same stream in Delphi and FPC.

Well, you have my blessing for the soPreserveBOM :)

Michael.



Re: [fpc-devel] TEncoding.Default and default encoding for TStrings.LoadFrom*()

2020-01-03 Thread Ondrej Pokorny

On 03.01.2020 10:14, Michael Van Canneyt wrote:

On Fri, 3 Jan 2020, Ondrej Pokorny wrote:


On 03.01.2020 00:35, Werner Pamler wrote:

Am 02.01.2020 um 20:10 schrieb Ondrej Pokorny:
TStrings.SaveTo*() writes BOM by default. Formerly the BOM was not 
written.


There is also the problem that currently it is not possible, without 
further action, to retain the BOM state of a file loaded into a 
stringlist, modified and written back because the presence of a BOM 
is forgotten after reading - see the other discussion
(https://lists.freepascal.org/pipermail/fpc-devel/2020-January/042363.html). 

Wouldn't it make sense to introduce a (Read)BOM property (a boolean 
parameter or an element of the new Options) which gets its value 
when the file is loaded or the strings are assigned? Then the user 
can set the StringList.WriteBOM equal to the StringList.BOM when he 
wants to keep the BOM for writing back.


Yes, that is perfectly reasonable. I'd prefer a new element in 
Options but there is the risk that Delphi adds a new option in the 
future and then we'll have a problem. So maybe a "PreserveBOM" or 
"SetWriteBOMOnLoad" property will be better. When set to true, 
WriteBOM will be set in LoadFrom*() according to BOM presence of the 
loaded file.


I don't see why a new option is a problem ? They are not streamed anyway.

So I would do the opposite, add an option. soPreserveBOM.


If you are fine with it, me as well.

Yes, the problem is if somebody streams the property or uses 
Ord(soPreserveBOM) for something etc. I admit that it is a very 
hypothetical issue.


Ondrej



Re: [fpc-devel] TEncoding.Default and default encoding for TStrings.LoadFrom*()

2020-01-03 Thread Michael Van Canneyt



On Fri, 3 Jan 2020, Ondrej Pokorny wrote:


On 03.01.2020 00:35, Werner Pamler wrote:

Am 02.01.2020 um 20:10 schrieb Ondrej Pokorny:
TStrings.SaveTo*() writes BOM by default. Formerly the BOM was not 
written.


There is also the problem that currently it is not possible, without 
further action, to retain the BOM state of a file loaded into a 
stringlist, modified and written back because the presence of a BOM is 
forgotten after reading - see the other discussion
(https://lists.freepascal.org/pipermail/fpc-devel/2020-January/042363.html).
Wouldn't it make sense to introduce a (Read)BOM property (a boolean 
parameter or an element of the new Options) which gets its value when 
the file is loaded or the strings are assigned? Then the user can set 
the StringList.WriteBOM equal to the StringList.BOM when he wants to 
keep the BOM for writing back.


Yes, that is perfectly reasonable. I'd prefer a new element in Options 
but there is the risk that Delphi adds a new option in the future and 
then we'll have a problem. So maybe a "PreserveBOM" or 
"SetWriteBOMOnLoad" property will be better. When set to true, WriteBOM 
will be set in LoadFrom*() according to BOM presence of the loaded file.


I don't see why a new option is a problem ? They are not streamed anyway.

So I would do the opposite, add an option. soPreserveBOM.

Michael.


Re: [fpc-devel] TEncoding.Default and default encoding for TStrings.LoadFrom*()

2020-01-03 Thread Ondrej Pokorny

On 03.01.2020 00:35, Werner Pamler wrote:

Am 02.01.2020 um 20:10 schrieb Ondrej Pokorny:
TStrings.SaveTo*() writes BOM by default. Formerly the BOM was not 
written.


There is also the problem that currently it is not possible, without 
further action, to retain the BOM state of a file loaded into a 
stringlist, modified and written back because the presence of a BOM is 
forgotten after reading - see the other discussion 
(https://lists.freepascal.org/pipermail/fpc-devel/2020-January/042363.html). 
Wouldn't it make sense to introduce a (Read)BOM property (a boolean 
parameter or an element of the new Options) which gets its value when 
the file is loaded or the strings are assigned? Then the user can set 
the StringList.WriteBOM equal to the StringList.BOM when he wants to 
keep the BOM for writing back.


Yes, that is perfectly reasonable. I'd prefer a new element in Options 
but there is the risk that Delphi adds a new option in the future and 
then we'll have a problem. So maybe a "PreserveBOM" or 
"SetWriteBOMOnLoad" property will be better. When set to true, WriteBOM 
will be set in LoadFrom*() according to BOM presence of the loaded file.


Ondrej



Re: [fpc-devel] TEncoding.Default and default encoding for TStrings.LoadFrom*()

2020-01-02 Thread Werner Pamler

Am 02.01.2020 um 20:10 schrieb Ondrej Pokorny:
TStrings.SaveTo*() writes BOM by default. Formerly the BOM was not 
written.


There is also the problem that currently it is not possible, without 
further action, to retain the BOM state of a file loaded into a 
stringlist, modified and written back because the presence of a BOM is 
forgotten after reading - see the other discussion 
(https://lists.freepascal.org/pipermail/fpc-devel/2020-January/042363.html). 
Wouldn't it make sense to introduce a (Read)BOM property (a boolean 
parameter or an element of the new Options) which gets its value when 
the file is loaded or the strings are assigned? Then the user can set 
the StringList.WriteBOM equal to the StringList.BOM when he wants to 
keep the BOM for writing back.




Re: [fpc-devel] TEncoding.Default and default encoding for TStrings.LoadFrom*()

2020-01-02 Thread Ondrej Pokorny

On 27.12.2019 12:01, Ondrej Pokorny wrote:

On 27.12.2019 10:40, Michael Van Canneyt wrote:

Yes, indeed. Therefore I suggested
* TEncoding.Default for the DefaultSystemCodePage variable
and
* TEncoding.ANSI for the system encoding.

Currently we have
* TEncoding.SystemEncoding for the DefaultSystemCodePage variable
and
* both TEncoding.ANSI and TEncoding.Default for the system encoding. 
(TEncoding.ANSI and TEncoding.Default are equal in FPC.)


In that case,  why not simply change:

 class function TEncoding.GetDefault: TEncoding;
 begin
   Result := GetSystemEncoding;
 end;

Nothing need be removed. I consider SystemEncoding a better name than Default,
and the latter should only be kept for Delphi compatibility. IMHO it would be
better to avoid Default, in fact I would change references to Default to
SystemEncoding for clarity. Default is completely non-descriptive.

If I understand your reasoning correct, that should solve the problems you
report ?


Yes, that perfectly solves the issues Lazarus developers and users 
face. I am OK with this solution as well. Thanks!


I applied the change

class function TEncoding.GetDefault: TEncoding;
begin
  Result := GetSystemEncoding;
end;

in r43842 before it gets forgotten. I removed the ANSI-hack from Lazarus 
as well - in r62474.
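
To see the effect of this change on a given setup, one can print the code pages behind the three TEncoding properties discussed in this thread. This is only a sketch: it assumes an FPC version that has TEncoding.SystemEncoding, and the values printed depend on the platform, the installed widestring manager and the current value of DefaultSystemCodePage, so no output is shown.

```pascal
program ShowEncodings;

{$mode objfpc}{$H+}

uses
  SysUtils;

begin
  // Code page the RTL substitutes for CP_ACP.
  WriteLn('DefaultSystemCodePage:    ', DefaultSystemCodePage);

  // After r43842 this is expected to match TEncoding.SystemEncoding.
  WriteLn('TEncoding.Default:        ', TEncoding.Default.CodePage);

  // FPC-only property that follows DefaultSystemCodePage.
  WriteLn('TEncoding.SystemEncoding: ', TEncoding.SystemEncoding.CodePage);

  // OS ANSI code page (GetACP on Windows, according to the RTL comment).
  WriteLn('TEncoding.ANSI:           ', TEncoding.ANSI.CodePage);
end.
```

In a Lazarus application on Windows, the first three values would be expected to show UTF-8 (65001) while TEncoding.ANSI stays at the OS ANSI code page.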


Please note that in Lazarus (where the system encoding is UTF-8), 
TStrings.SaveTo*() writes BOM by default. Formerly the BOM was not 
written. Bart reported the issue here: 
https://lists.freepascal.org/pipermail/fpc-devel/2020-January/042372.html


Ondrej



Re: [fpc-devel] TEncoding.Default and default encoding for TStrings.LoadFrom*()

2019-12-27 Thread Mattias Gaertner via fpc-devel
On Fri, 27 Dec 2019 12:01:24 +0100
Ondrej Pokorny  wrote:

>[...]
> > If I understand your reasoning correct, that should solve the
> > problems you
> > report ?  
> 
> Yes, that perfectly solves the issues Lazarus developers and users
> face. I am OK with this solution as well. Thanks!

Thank you both \O/

Mattias


Re: [fpc-devel] TEncoding.Default and default encoding for TStrings.LoadFrom*()

2019-12-27 Thread Ondrej Pokorny

On 27.12.2019 10:40, Michael Van Canneyt wrote:

Yes, indeed. Therefore I suggested
* TEncoding.Default for the DefaultSystemCodePage variable
and
* TEncoding.ANSI for the system encoding.

Currently we have
* TEncoding.SystemEncoding for the DefaultSystemCodePage variable
and
* both TEncoding.ANSI and TEncoding.Default for the system encoding. 
(TEncoding.ANSI and TEncoding.Default are equal in FPC.)


In that case,  why not simply change:

 class function TEncoding.GetDefault: TEncoding;
 begin
   Result := GetSystemEncoding;
 end;

Nothing need be removed. I consider SystemEncoding a better name than Default,
and the latter should only be kept for Delphi compatibility. IMHO it would be
better to avoid Default, in fact I would change references to Default to
SystemEncoding for clarity. Default is completely non-descriptive.

If I understand your reasoning correct, that should solve the problems you
report ?


Yes, that perfectly solves the issues Lazarus developers and users face. 
I am OK with this solution as well. Thanks!


Ondrej



Re: [fpc-devel] TEncoding.Default and default encoding for TStrings.LoadFrom*()

2019-12-27 Thread Michael Van Canneyt



On Fri, 27 Dec 2019, Ondrej Pokorny wrote:


On 27.12.2019 0:19, Michael Van Canneyt wrote:

On Thu, 26 Dec 2019, Ondrej Pokorny wrote:


On 26.12.2019 19:29, Michael Van Canneyt wrote:
So no, I don't think these need to be changed/merged. What IMO can be discussed is
which of these 2 need to be used as the default codepage in other code. It
should then resolve the problems that appear, I think.


That would be possible as well. But still please reconsider it:
One reason: just from the convention - the default codepage to use 
should be TEncoding.Default. That is intuitive.


Second reason: Now we have TEncoding.ANSI = TEncoding.Default. 2 
equal properties. And another FPC-only property 
TEncoding.SystemEncoding. That means 3 properties for 2 values.


As far as I know, TEncoding.ANSI = CP_ACP.


This is indeed not correct. See 
https://wiki.freepascal.org/FPC_Unicode_support :
CP_ACP: this value represents the currently set "default system code 
page". See #Code page settings for more information.


I meant the windows meaning of CP_ACP, not what the RTL makes of it. 
I think the use of CP_ACP in the RTL is quite dubious.


Using CP_SYSTEM or so would have been better. No doubt again a Delphi
compatibility naming :(


TMBCSEncoding.Create(widestringmanager.GetStandardCodePageProc(scpAnsi))


This corresponds to what I meant.



and
  TStandardCodePageEnum = (
    scpAnsi, // system Ansi code page (GetACP on windows)

- as you can see the CP_ACP value does not correspond with the GetACP 
WinAPI call result. (But this is wanted as documented in 
https://wiki.freepascal.org/FPC_Unicode_support ).


Why should this equal TEncoding.Default ? 


sysencoding.inc:

class function TEncoding.GetDefault: TEncoding;
begin
  Result := GetANSI;
end;


I know it is currently so, the question is : why ? :)

Maybe Default should rather be SystemEncoding, see below.




I think  TEncoding.Default  = CP_UTF8 on linux ?


Yes, in FPC this is correct. Also TEncoding.ANSI =CP_UTF8 on linux in FPC.


Not necessarily, if I read the wiki page correctly.





The main problem I see is that there is the system (OS) encoding, and the
encoding specified by DefaultSystemCodePage.

These do not necessarily agree. So it makes sense to have 2 TEncodings: one
for the system encoding, one for the DefaultSystemCodePage variable. They
will not be equal.

If they were, then the DefaultSystemCodePage variable makes no sense whatever.


Yes, indeed. Therefore I suggested
* TEncoding.Default for the DefaultSystemCodePage variable
and
* TEncoding.ANSI for the system encoding.

Currently we have
* TEncoding.SystemEncoding for the DefaultSystemCodePage variable
and
* both TEncoding.ANSI and TEncoding.Default for the system encoding. 
(TEncoding.ANSI and TEncoding.Default are equal in FPC.)


In that case,  why not simply change:

 class function TEncoding.GetDefault: TEncoding;
 begin
   Result := GetSystemEncoding;
 end;

Nothing need be removed. I consider SystemEncoding a better name than Default,
and the latter should only be kept for Delphi compatibility. IMHO it would be
better to avoid Default, in fact I would change references to Default to
SystemEncoding for clarity. Default is completely non-descriptive.

If I understand your reasoning correct, that should solve the problems you
report ?

Michael.


Re: [fpc-devel] TEncoding.Default and default encoding for TStrings.LoadFrom*()

2019-12-26 Thread Ondrej Pokorny

On 26.12.2019 23:42, Marco van de Voort wrote:

Op 12/26/2019 om 9:12 PM schreef Ondrej Pokorny:


In Delphi TEncoding.ANSI and TEncoding.Default are actually 
different. See:
http://docwiki.embarcadero.com/Libraries/Rio/en/System.SysUtils.TEncoding.Default 

http://docwiki.embarcadero.com/Libraries/Rio/en/System.SysUtils.TEncoding.ANSI 



On Windows, they are equal but on POSIX they are different: 
TEncoding.Default is UTF-8 but TEncoding.ANSI is the code page from 
CFLocaleGetIdentifier.



And in FPC it is exactly the same,


No, it is not. In FPC:
class function TEncoding.GetDefault: TEncoding;
begin
  Result := GetANSI;
end;

For the meaning of TEncoding.Default and TEncoding.ANSI in Delphi see 
the docs above.




BUT Lazarus overrides default with UTF8 on Windows.


Yes, it does it since r61976 - so only recently. And it is a very 
questionable commit because

a) is not Delphi compatible
b) breaks OS-ANSI calls
c) breaks ANSI FPC code

It must either be reverted or we need some high-level method to get the 
OS-ANSI codepage without this override.




As you can see that is NOT compatible with Delphi above.
Yes, and I am against r61976 - but because r61976 overrides 
TEncoding.ANSI to UTF-8 on Windows. IMO TEncoding.Default should be 
UTF-8 in Lazarus even on Windows (whereas TEncoding.ANSI should stay 
OS-ANSI) - I try to explain again why this actually fits very well into 
the Delphi/FPC encoding concept. I will now talk only about Windows for 
simplicity (because the ANSI concept is most important on Windows):


Delphi doesn't know the DefaultSystemCodePage concept that FPC has. The 
default AnsiString encoding in Delphi is always OS-ANSI (CP_ACP). 
Therefore it makes perfect sense for TEncoding.Default to point to 
the default AnsiString encoding, which is the OS-ANSI encoding in Delphi.


FPC, on the contrary, overrides the CP_ACP value with 
DefaultSystemCodePage. So the default AnsiString encoding is not OS-ANSI 
but DefaultSystemCodePage. Therefore, again, it makes perfect sense for 
TEncoding.Default to point to the default AnsiString encoding, which 
is DefaultSystemCodePage in FPC.


Ondrej


Re: [fpc-devel] TEncoding.Default and default encoding for TStrings.LoadFrom*()

2019-12-26 Thread Ondrej Pokorny

On 27.12.2019 0:19, Michael Van Canneyt wrote:

On Thu, 26 Dec 2019, Ondrej Pokorny wrote:


On 26.12.2019 19:29, Michael Van Canneyt wrote:
So no, I don't think these need to be changed/merged. What IMO can be discussed is
which of these 2 need to be used as the default codepage in other code. It
should then resolve the problems that appear, I think.


That would be possible as well. But still please reconsider it:
One reason: just from the convention - the default codepage to use 
should be TEncoding.Default. That is intuitive.


Second reason: Now we have TEncoding.ANSI = TEncoding.Default. 2 
equal properties. And another FPC-only property 
TEncoding.SystemEncoding. That means 3 properties for 2 values.


As far as I know, TEncoding.ANSI = CP_ACP.


This is indeed not correct. See 
https://wiki.freepascal.org/FPC_Unicode_support :
CP_ACP: this value represents the currently set "default system code 
page". See #Code page settings for more information.


The code for it is in sysos.inc:
function TranslatePlaceholderCP(cp: TSystemCodePage): TSystemCodePage; {$ifdef SYSTEMINLINE}inline;{$endif}
begin
  TranslatePlaceholderCP:=cp;
  case cp of
    CP_OEMCP:
      TranslatePlaceholderCP:=GetOEMCP;
    CP_ACP:
      TranslatePlaceholderCP:=DefaultSystemCodePage;
  end;
end;

Whereas TEncoding.ANSI is the WIN-ANSI OS encoding:

class function TEncoding.GetANSI: TEncoding;
// ...
    FStandardEncodings[seAnsi] := TMBCSEncoding.Create(widestringmanager.GetStandardCodePageProc(scpAnsi))


and
  TStandardCodePageEnum = (
    scpAnsi, // system Ansi code page (GetACP on windows)

- as you can see the CP_ACP value does not correspond with the GetACP 
WinAPI call result. (But this is wanted as documented in 
https://wiki.freepascal.org/FPC_Unicode_support ).


Why should this equal TEncoding.Default ? 


sysencoding.inc:

class function TEncoding.GetDefault: TEncoding;
begin
  Result := GetANSI;
end;


I think  TEncoding.Default  = CP_UTF8 on linux ?


Yes, in FPC this is correct. Also TEncoding.ANSI =CP_UTF8 on linux in FPC.



The main problem I see is that there is the system (OS) encoding, and the
encoding specified by DefaultSystemCodePage.

These do not necessarily agree. So it makes sense to have 2 TEncodings: one
for the system encoding, one for the DefaultSystemCodePage variable. They
will not be equal.

If they were, then the DefaultSystemCodePage variable makes no sense whatever.


Yes, indeed. Therefore I suggested
* TEncoding.Default for the DefaultSystemCodePage variable
and
* TEncoding.ANSI for the system encoding.

Currently we have
* TEncoding.SystemEncoding for the DefaultSystemCodePage variable
and
* both TEncoding.ANSI and TEncoding.Default for the system encoding. 
(TEncoding.ANSI and TEncoding.Default are equal in FPC.)


Ondrej



Re: [fpc-devel] TEncoding.Default and default encoding for TStrings.LoadFrom*()

2019-12-26 Thread Michael Van Canneyt



On Thu, 26 Dec 2019, Ondrej Pokorny wrote:


On 26.12.2019 19:29, Michael Van Canneyt wrote:
So no, I don't think these need to be changed/merged. What IMO can be discussed is
which of these 2 need to be used as the default codepage in other code. It
should then resolve the problems that appear, I think.


That would be possible as well. But still please reconsider it:
One reason: just from the convention - the default codepage to use 
should be TEncoding.Default. That is intuitive.


Second reason: Now we have TEncoding.ANSI = TEncoding.Default. 2 equal 
properties. And another FPC-only property TEncoding.SystemEncoding. That 
means 3 properties for 2 values.


As far as I know, TEncoding.ANSI = CP_ACP. 
Why should this equal TEncoding.Default ? 
I think  TEncoding.Default  = CP_UTF8 on linux ?


The main problem I see is that there is the system (OS) encoding, and the
encoding specified by DefaultSystemCodePage.

These do not necessarily agree. So it makes sense to have 2 TEncodings: one
for the system encoding, one for the DefaultSystemCodePage variable. They
will not be equal.

If they were, then the DefaultSystemCodePage variable makes no sense whatever.

Michael.


Re: [fpc-devel] TEncoding.Default and default encoding for TStrings.LoadFrom*()

2019-12-26 Thread Marco van de Voort

Op 12/26/2019 om 9:12 PM schreef Ondrej Pokorny:


In Delphi TEncoding.ANSI and TEncoding.Default are actually different. 
See:
http://docwiki.embarcadero.com/Libraries/Rio/en/System.SysUtils.TEncoding.Default 

http://docwiki.embarcadero.com/Libraries/Rio/en/System.SysUtils.TEncoding.ANSI 



On Windows, they are equal but on POSIX they are different: 
TEncoding.Default is UTF-8 but TEncoding.ANSI is the code page from 
CFLocaleGetIdentifier.


And in FPC it is exactly the same, BUT Lazarus overrides default with 
UTF8 on Windows. As you can see that is NOT compatible with Delphi above.


Worse, since the startup encoding is the encoding to communicate with 
the OS, as soon as



Read the .NET docs about Encoding.Default:
https://docs.microsoft.com/en-us/dotnet/api/system.text.encoding.default?redirectedfrom=MSDN=netframework-4.8#System_Text_Encoding_Default 

on .NET Framework it is ANSI but on .NET Core it is UTF-8 even on 


Yes, totally irrelevant. On Windows ansi means something like 
Windows-1252 and  -A apis, and the only unicode api is -W and UTF8. .NET 
is as relevant as Linux in this matter; other application API.


With all the information from the docs, I am more and more convinced 
that TEncoding.SystemEncoding is superfluous and TEncoding.Default 
should take over its meaning: TEncoding.Default should reflect changes 
in DefaultSystemCodePage. Whereas TEncoding.ANSI should stay a fixed 
ANSI code page. With it there is no need for TEncoding.SystemEncoding.
The defaultsystemencoding changes the meaning of the codepage for the 
application libraries (read: the pascal parts), NOT for the delphi api.




With this change, in the current Lazarus UTF-8 solution, 
TEncoding.Default will be UTF-8. In the future Unicode and 
Delphi-compatible FPC/Lazarus, TEncoding.Default will get the Delphi 
meaning (ANSI/UTF-8). IMO the concept is very sensible.

Delphi is UTF-16. UTF-8 is only used for document formats, not for APIs.


Re: [fpc-devel] TEncoding.Default and default encoding for TStrings.LoadFrom*()

2019-12-26 Thread Bart via fpc-devel
On Thu, Dec 26, 2019 at 9:12 PM Ondrej Pokorny  wrote:

> With all the information from the docs, I am more and more convinced
> that TEncoding.SystemEncoding is superfluous and TEncoding.Default
> should take over its meaning: TEncoding.Default should reflect changes
> in DefaultSystemCodePage. Whereas TEncoding.ANSI should stay a fixed
> ANSI code page. With it there is no need for TEncoding.SystemEncoding.

I agree with Ondrej on this point.

> With this change, in the current Lazarus UTF-8 solution,
> TEncoding.Default will be UTF-8. In the future Unicode and
> Delphi-compatible FPC/Lazarus, TEncoding.Default will get the Delphi
> meaning (ANSI/UTF-8). IMO the concept is very sensible.

It would make life much easier for the Lazarus developers.
Currently we're kind of fighting the compiler, which is not good.

-- 
Bart


Re: [fpc-devel] TEncoding.Default and default encoding for TStrings.LoadFrom*()

2019-12-26 Thread Ondrej Pokorny

On 26.12.2019 19:29, Michael Van Canneyt wrote:
So no, I don't think these need to be changed/merged. What IMO can be discussed is
which of these 2 need to be used as the default codepage in other code. It
should then resolve the problems that appear, I think.


That would be possible as well. But still please reconsider it:
One reason: just from the convention - the default codepage to use 
should be TEncoding.Default. That is intuitive.
Second reason: Now we have TEncoding.ANSI = TEncoding.Default. 2 equal 
properties. And another FPC-only property TEncoding.SystemEncoding. That 
means 3 properties for 2 values.

---

In Delphi TEncoding.ANSI and TEncoding.Default are actually different. See:
http://docwiki.embarcadero.com/Libraries/Rio/en/System.SysUtils.TEncoding.Default
http://docwiki.embarcadero.com/Libraries/Rio/en/System.SysUtils.TEncoding.ANSI

On Windows, they are equal but on POSIX they are different: 
TEncoding.Default is UTF-8 but TEncoding.ANSI is the code page from 
CFLocaleGetIdentifier.


Read the .NET docs about Encoding.Default:
https://docs.microsoft.com/en-us/dotnet/api/system.text.encoding.default?redirectedfrom=MSDN=netframework-4.8#System_Text_Encoding_Default
on .NET Framework it is ANSI but on .NET Core it is UTF-8 even on Windows.

With all the information from the docs, I am more and more convinced 
that TEncoding.SystemEncoding is superfluous and TEncoding.Default 
should take over its meaning: TEncoding.Default should reflect changes 
in DefaultSystemCodePage. Whereas TEncoding.ANSI should stay a fixed 
ANSI code page. With it there is no need for TEncoding.SystemEncoding.


With this change, in the current Lazarus UTF-8 solution, 
TEncoding.Default will be UTF-8. In the future Unicode and 
Delphi-compatible FPC/Lazarus, TEncoding.Default will get the Delphi 
meaning (ANSI/UTF-8). IMO the concept is very sensible.


---

Btw. you have a bug in:

constructor TStringStream.CreateRaw(const AString: RawByteString);
var
  CP: TSystemCodePage;
begin
  CP:=StringCodePage(AString);
  if (CP=CP_ACP) or (CP=TEncoding.Default.CodePage) then // this line is wrong
    begin
    FEncoding:=TEncoding.Default;
    FOwnsEncoding:=False;
    end
  else

In the code above, TEncoding.Default is used if CP=CP_ACP. That is 
currently wrong - the bug perfectly reflects my suggestion for 
the TEncoding.Default change. Currently, CP_ACP corresponds to 
DefaultSystemCodePage and thus to TEncoding.SystemEncoding, not to 
TEncoding.Default. TEncoding.Default corresponds to ANSI (which is not 
CP_ACP, as documented at https://wiki.freepascal.org/FPC_Unicode_support ).


The code should be:
if (CP=CP_ACP) or (CP=TEncoding.SystemEncoding.CodePage) then
begin
  FEncoding:=TEncoding.SystemEncoding;
  FOwnsEncoding:=False;
end else
if (CP=TEncoding.Default.CodePage) then
begin
  FEncoding:=TEncoding.Default;
  FOwnsEncoding:=False;
end else
// ...

The current CreateRaw code is correct for my suggestion. As you can see 
you intuitively expected the approach I am suggesting :)


Ondrej



Re: [fpc-devel] TEncoding.Default and default encoding for TStrings.LoadFrom*()

2019-12-26 Thread Michael Van Canneyt



On Thu, 26 Dec 2019, Ondrej Pokorny wrote:


Hello,

a lot of people have a problem with the TStrings.LoadFrom*() changes 
when TEncoding support was added.


That this was going to create problems and require code changes in user
code, was clear from the start.



I suggest a compromise (steps):

1.) Keep TEncoding.ANSI always WIN-ANSI and Delphi-compatible. (Don't 
change it to DefaultSystemCodePage in Lazarus.)
2.) Change TEncoding.Default value to current TEncoding.SystemEncoding. 
I.e. TEncoding.Default would correspond to DefaultSystemCodePage and 
CP_ACP. Yes, this will be Delphi-incompatible - but CP_ACP is 
Delphi-incompatible as well (!) - so the incompatibilities are 
consistent here.
3.) Delete TEncoding.SystemEncoding because it is an FPC-only construct, 
it is not needed anymore (because it will become TEncoding.Default) and 
it has not been released in any stable version.


TEncoding.SystemEncoding was introduced to reflect changes in DefaultSystemCodePage 
whereas TEncoding.Default does not change, it reflects a fixed code page.

What I think should be done is make sure TEncoding.Default is initialized in
the sysutils unit initialization, so it is the actual system default.

So no, I don't think these need to be changed/merged. What IMO can be discussed is
which of these 2 need to be used as the default codepage in other code. It
should then resolve the problems that appear, I think.

Michael.


Re: [fpc-devel] TEncoding.Default and default encoding for TStrings.LoadFrom*()

2019-12-26 Thread Ondrej Pokorny

On 26.12.2019 17:02, Mattias Gaertner via fpc-devel wrote:

On Thu, 26 Dec 2019 16:55:04 +0100
Ondrej Pokorny  wrote:


On 26.12.2019 16:41, Mattias Gaertner via fpc-devel wrote:

On Thu, 26 Dec 2019 16:15:03 +0100
Ondrej Pokorny  wrote:
  

Hello,

a lot of people have a problem with the TStrings.LoadFrom*()
changes when TEncoding support was added.

Currently, the no-encoding overloads of TStrings.LoadFrom*() and
TStrings.SaveTo*() use the TEncoding.Default, which is WIN-ANSI and
not DefaultSystemCodePage.

It seems FPC 3.3.1 does use DefaultSystemCodePage:

class function TEncoding.GetANSI: TEncoding;
begin
  if not Assigned(FStandardEncodings[seAnsi]) then
  begin
    // DefaultSystemCodePage can be set to non-ANSI
    if Assigned(widestringmanager.GetStandardCodePageProc) then
      FStandardEncodings[seAnsi] := TMBCSEncoding.Create(widestringmanager.GetStandardCodePageProc(scpAnsi))
    else
      FStandardEncodings[seAnsi] := TMBCSEncoding.Create(DefaultSystemCodePage);
    ...
end;

Check the code more carefully. It uses DefaultSystemCodePage only
when no widestringmanager is present - which is basically never the
case (at least on win32, Linux, Mac OS).

It uses widestringmanager.GetStandardCodePageProc(scpAnsi) that is
WIN-ANSI on win32 (typically 1250, 1251, 1252 - depending on your OS
language version).

Yes, I just saw it. Bummer.


The comment
// DefaultSystemCodePage can be set to non-ANSI
is misleading: it matches neither the code nor the currently desired
behavior, see https://bugs.freepascal.org/view.php?id=32961#c115162


I deleted it.

Ondrej



Re: [fpc-devel] TEncoding.Default and default encoding for TStrings.LoadFrom*()

2019-12-26 Thread Mattias Gaertner via fpc-devel
On Thu, 26 Dec 2019 16:55:04 +0100
Ondrej Pokorny  wrote:

> On 26.12.2019 16:41, Mattias Gaertner via fpc-devel wrote:
> > On Thu, 26 Dec 2019 16:15:03 +0100
> > Ondrej Pokorny  wrote:
> >  
> >> Hello,
> >>
> >> a lot of people have a problem with the TStrings.LoadFrom*()
> >> changes when TEncoding support was added.
> >>
> >> Currently, the no-encoding overloads of TStrings.LoadFrom*() and
> >> TStrings.SaveTo*() use the TEncoding.Default, which is WIN-ANSI and
> >> not DefaultSystemCodePage.  
> > It seems FPC 3.3.1 does use DefaultSystemCodePage:
> >
> > class function TEncoding.GetANSI: TEncoding;
> > begin
> >
> >   if not Assigned(FStandardEncodings[seAnsi]) then
> >   begin
> >     // DefaultSystemCodePage can be set to non-ANSI
> >     if Assigned(widestringmanager.GetStandardCodePageProc) then
> >       FStandardEncodings[seAnsi] :=
> >         TMBCSEncoding.Create(widestringmanager.GetStandardCodePageProc(scpAnsi))
> >     else
> >       FStandardEncodings[seAnsi] :=
> >         TMBCSEncoding.Create(DefaultSystemCodePage);
> >     ...
> > end;
> 
> Check the code more carefully. It uses DefaultSystemCodePage only
> when no widestringmanager is present - which is basically never the
> case (at least on win32, Linux, Mac OS).
> 
> It uses widestringmanager.GetStandardCodePageProc(scpAnsi) that is 
> WIN-ANSI on win32 (typically 1250, 1251, 1252 - depending on your OS 
> language version).

Yes, I just saw it. Bummer.

Mattias


Re: [fpc-devel] TEncoding.Default and default encoding for TStrings.LoadFrom*()

2019-12-26 Thread Ondrej Pokorny


On 26.12.2019 16:41, Mattias Gaertner via fpc-devel wrote:
> On Thu, 26 Dec 2019 16:15:03 +0100
> Ondrej Pokorny wrote:
>
> > Hello,
> >
> > a lot of people have a problem with the TStrings.LoadFrom*() changes
> > when TEncoding support was added.
> >
> > Currently, the no-encoding overloads of TStrings.LoadFrom*() and
> > TStrings.SaveTo*() use the TEncoding.Default, which is WIN-ANSI and
> > not DefaultSystemCodePage.
>
> It seems FPC 3.3.1 does use DefaultSystemCodePage:
>
> class function TEncoding.GetANSI: TEncoding;
> begin
>
>   if not Assigned(FStandardEncodings[seAnsi]) then
>   begin
>     // DefaultSystemCodePage can be set to non-ANSI
>     if Assigned(widestringmanager.GetStandardCodePageProc) then
>       FStandardEncodings[seAnsi] :=
>         TMBCSEncoding.Create(widestringmanager.GetStandardCodePageProc(scpAnsi))
>     else
>       FStandardEncodings[seAnsi] :=
>         TMBCSEncoding.Create(DefaultSystemCodePage);
>     ...
> end;

Check the code more carefully. It uses DefaultSystemCodePage only when
no widestringmanager is present - which is basically never the case (at
least on win32, Linux, Mac OS).

It uses widestringmanager.GetStandardCodePageProc(scpAnsi) that is
WIN-ANSI on win32 (typically 1250, 1251, 1252 - depending on your OS
language version).
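
Until the default is settled, code that must not depend on it can pass the
encoding explicitly. A minimal sketch, assuming the TEncoding overloads of
SaveToFile/LoadFromFile discussed in this thread; the program name and the
file name example.txt are made up for illustration:

```pascal
program LoadWithEncoding;
{$mode objfpc}{$H+}
uses
  {$ifdef unix}cwstring,{$endif}  // installs a widestring manager on Unix
  Classes, SysUtils;
var
  SL: TStringList;
begin
  SL := TStringList.Create;
  try
    SL.Add('example line');
    // Explicit encoding: TEncoding.Default is not consulted here.
    SL.SaveToFile('example.txt', TEncoding.UTF8);
    SL.Clear;
    // Read the file back with the same explicit encoding.
    SL.LoadFromFile('example.txt', TEncoding.UTF8);
    WriteLn(SL.Text);
  finally
    SL.Free;
  end;
end.
```

This sidesteps the question of which default the no-encoding overloads
should use; it does not answer it.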


Ondrej



Re: [fpc-devel] TEncoding.Default and default encoding for TStrings.LoadFrom*()

2019-12-26 Thread Mattias Gaertner via fpc-devel
On Thu, 26 Dec 2019 16:15:03 +0100
Ondrej Pokorny  wrote:

> Hello,
> 
> a lot of people have a problem with the TStrings.LoadFrom*() changes 
> when TEncoding support was added.
> 
> Currently, the no-encoding overloads of TStrings.LoadFrom*() and 
> TStrings.SaveTo*() use the TEncoding.Default, which is WIN-ANSI and
> not DefaultSystemCodePage.

It seems FPC 3.3.1 does use DefaultSystemCodePage:

class function TEncoding.GetANSI: TEncoding;
begin

  if not Assigned(FStandardEncodings[seAnsi]) then
  begin
    // DefaultSystemCodePage can be set to non-ANSI
    if Assigned(widestringmanager.GetStandardCodePageProc) then
      FStandardEncodings[seAnsi] :=
        TMBCSEncoding.Create(widestringmanager.GetStandardCodePageProc(scpAnsi))
    else
      FStandardEncodings[seAnsi] :=
        TMBCSEncoding.Create(DefaultSystemCodePage);
    ...
end;

Maybe you are querying TEncoding.Default before changing
DefaultSystemCodePage?


Mattias
 