On 15.09.2018 at 07:34, Mattias Gaertner via fpc-pascal wrote:
> To have the result in a specific codepage use
> SetCodePage(result,NeededCP,true);
As I wrote, doing this by hand works, but I find it hard to believe that anyone
intended it to be done that way. Why would "CP_UTF16 + CP_UTF8(literal) =
cp1252" be the desired outcome?

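  operator >< (a, b: RawByteString): RawByteString;
  var
    t: RawByteString;
    la, lt: Integer;
  begin
    // Convert a copy of b to the code page of a.
    t:= b;
    SetCodePage(t, StringCodePage(a), True);
    la:= Length(a);
    lt:= Length(t);
    // Result inherits the code page of a; grow it and append the bytes of t.
    Result:= a;
    SetLength(Result, la + lt);
    if lt > 0 then  // t[1] must not be touched when t is empty
      Move(t[1], Result[la+1], lt);
  end;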

With that, one can write "foo:= bar >< x" and it just works.


> Only on ancient Windows it was UCS2.
In that case, fpwidestring is wrong as well, see fpwidestring.pp:262.

MSDN is slightly unclear:
   "1200        utf-16
   Unicode UTF-16, little endian byte order (BMP of ISO 10646);
   available only to managed applications"

The "managed applications" part is why WideCharToMultiByte simply returns an
empty string when asked to convert anything to cp1200, instead of just doing the
plain memcpy.
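
For anyone who wants to check this on their own machine, here is a minimal
probe. It is only a sketch: the program name and the variable names are made up
for illustration, and I make no claim about what it prints on any particular
Windows version. It asks WideCharToMultiByte for the required buffer size with
code page 1200 and shows the return value plus GetLastError if the call fails.

```pascal
program Cp1200Probe;
{$mode objfpc}{$H+}

{$ifdef windows}
uses
  Windows;

var
  w: UnicodeString;
  needed: LongInt;
begin
  w := 'abc';
  // Passing nil/0 as the output buffer asks only for the required size.
  needed := WideCharToMultiByte(1200, 0, PWideChar(w), Length(w),
                                nil, 0, nil, nil);
  WriteLn('bytes needed for cp1200: ', needed);
  // A return value of 0 means the call failed; show why.
  if needed = 0 then
    WriteLn('GetLastError: ', GetLastError);
end.
{$else}
begin
  // WideCharToMultiByte is a Windows API, so there is nothing to probe here.
  WriteLn('Not on Windows, nothing to probe.');
end.
{$endif}
```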

"only the BMP" would be UCS2. In other places, surrogate pairs are mentioned,
making it a true UTF encoding.
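
To make the UCS2 vs. UTF-16 distinction concrete, here is a small sketch that
does the surrogate arithmetic by hand, without relying on any RTL conversion.
The function and procedure names are mine, chosen for illustration only. A code
point above U+FFFF needs two 16-bit units, which is exactly what a BMP-only
encoding cannot represent.

```pascal
program SurrogateDemo;
{$mode objfpc}{$H+}

// Number of 16-bit code units UTF-16 needs for one code point.
function Utf16UnitCount(cp: Cardinal): Integer;
begin
  if cp < $10000 then
    Result := 1   // inside the BMP: one unit, same as UCS2
  else
    Result := 2;  // outside the BMP: a surrogate pair
end;

// Split a code point >= $10000 into its high and low surrogate.
procedure EncodeSurrogates(cp: Cardinal; out hiUnit, loUnit: Word);
begin
  cp := cp - $10000;                  // leaves a 20-bit value
  hiUnit := $D800 or (cp shr 10);     // top 10 bits
  loUnit := $DC00 or (cp and $3FF);   // bottom 10 bits
end;

var
  hiUnit, loUnit: Word;
begin
  EncodeSurrogates($1F600, hiUnit, loUnit);
  // prints "2 units: D83D DE00"
  WriteLn(Utf16UnitCount($1F600), ' units: ',
          HexStr(hiUnit, 4), ' ', HexStr(loUnit, 4));
end.
```

A code page that is really "only the BMP" has no way to carry such a pair, so
any API that passes surrogates through unchanged is effectively UTF-16.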

In any case, I think the RTL should be consistent across platforms?

--
Regards,
Martok


