Re: [fpc-pascal] Concatenating CP Strings

2018-09-15 Thread Martok

> Setting the code page of a file tells the RTL about the encoding of the 
> strings in the file. The string's static code page (which maps to 
> DefaultSystemCodePage if none is specified) tells the compiler to which 
> encoding this string data should be converted.
I know!

That doesn't make it any more *useful*.

-- 
Regards,
Martok

___
fpc-pascal maillist  -  fpc-pascal@lists.freepascal.org
http://lists.freepascal.org/cgi-bin/mailman/listinfo/fpc-pascal

Re: [fpc-pascal] Concatenating CP Strings

2018-09-15 Thread Jonas Maebe

On 15/09/18 22:32, Martok wrote:

> Gah, accidentally removed the comment that said what the actual problem is ;-)
>
>   ReadLn(f, s);
>   WriteLn(StringCodePage(s));
>
> That prints 1252, which is the DefaultSystemCodePage. At that point, information
> loss has already occurred, there is no way to fix the CP in user code.
> I would expect reading from a file whose codepage I have just set to return
> strings in that codepage. Instead, I get the declared codepage of the string.


Setting the code page of a file tells the RTL about the encoding of the 
strings in the file. The string's static code page (which maps to 
DefaultSystemCodePage if none is specified) tells the compiler to which 
encoding this string data should be converted.
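
A minimal sketch of what this implies for the ReadLn case, assuming the RTL
converts to the declared code page of the target string as described above.
The type name TCP866String is made up for illustration; the file name is the
one from the example program in this thread.

```pascal
program ReadCp866;
{$mode objfpc}{$H+}
type
  // Hypothetical type name: an AnsiString with a static code page of 866.
  TCP866String = type AnsiString(866);
var
  f: TextFile;
  s: TCP866String;
begin
  AssignFile(f, 'a_file.txt');
  SetTextCodePage(f, 866);     // tells the RTL how the file is encoded
  Reset(f);
  ReadLn(f, s);                // the target string is declared as cp866
  CloseFile(f);
  WriteLn(StringCodePage(s));  // reports the code page the data ended up in
end.
```

If the description above holds, this should report 866 rather than
DefaultSystemCodePage, because the declared code page of s now matches the
code page set on the file.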



Jonas

Re: [fpc-pascal] Concatenating CP Strings

2018-09-15 Thread Martok
Gah, accidentally removed the comment that said what the actual problem is ;-)

>   ReadLn(f, s);
>   WriteLn(StringCodePage(s));

That prints 1252, which is the DefaultSystemCodePage. At that point, information
loss has already occurred; there is no way to fix the CP in user code.
I would expect reading from a file whose codepage I have just set to return
strings in that codepage. Instead, I get the declared codepage of the string.


-- 
Regards,
Martok



Re: [fpc-pascal] Concatenating CP Strings

2018-09-15 Thread Martok
And another one:

var
  f: TextFile;
  s: string;
begin
  AssignFile(f, 'a_file.txt');
  SetTextCodePage(f, 866);
  Reset(f);
  ReadLn(f, s);
  WriteLn(StringCodePage(s));
  readln;
end.

That is rather useless...


Writing anything into the specified codepage works perfectly fine.


-- 
Regards,
Martok


Re: [fpc-pascal] Concatenating CP Strings

2018-09-15 Thread Martok
On 15.09.2018 at 07:34, Mattias Gaertner via fpc-pascal wrote:
> To have the result in a specific codepage use
> SetCodePage(result,NeededCP,true);
As I wrote, doing this by hand works, but I don't want to believe somebody
thought that was how it should be. Why would "CP_UTF16 + CP_UTF8(literal) =
cp1252" be the desired outcome?

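  operator >< (a, b: RawByteString): RawByteString;
  var
    t: RawByteString;
    la, lt: Integer;
  begin
    // Convert a copy of b to the dynamic code page of a before appending.
    t := b;
    SetCodePage(t, StringCodePage(a), True);
    la := Length(a);
    lt := Length(t);
    Result := a;
    SetLength(Result, la + lt);
    // Guard against indexing an empty string when b is empty.
    if lt > 0 then
      Move(t[1], Result[la + 1], lt);
  end;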

With that, one can write "foo:= bar >< x" and it just works.


> Only on ancient Windows it was UCS2.
In that case, fpwidestring is wrong as well, see fpwidestring.pp:262.

MSDN is slightly unclear:
   "1200   utf-16
   Unicode UTF-16, little endian byte order (BMP of ISO 10646);
   available only to managed applications"

The "managed applications" part is why WideCharToMultiByte simply returns an
empty string when asked to convert anything to cp1200, instead of just doing the
plain memcpy.

"only the BMP" would be UCS2. In other places, surrogate pairs are mentioned,
making it a true UTF encoding.
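
To illustrate the difference, here is a small sketch (not taken from the RTL)
of a code point outside the BMP stored as a surrogate pair in a UnicodeString.
UCS2 has no way to represent it; UTF-16 needs two code units. The program name
and variable names are made up for the example.

```pascal
program SurrogateDemo;
{$mode objfpc}{$H+}
var
  u: UnicodeString;
  cp: Cardinal;
begin
  // U+1F600 is outside the BMP; its UTF-16 form is the pair D83D DE00.
  SetLength(u, 2);
  u[1] := WideChar($D83D);  // high (lead) surrogate
  u[2] := WideChar($DE00);  // low (trail) surrogate
  // Recombine the pair into a single code point.
  cp := $10000 + ((Ord(u[1]) - $D800) shl 10) + (Ord(u[2]) - $DC00);
  WriteLn('UTF-16 code units: ', Length(u));  // 2
  WriteLn('Code point: ', cp);                // 128512 = $1F600
end.
```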

In any case, I think the RTL should be consistent across platforms?

--
Regards,
Martok


