Re: [fpc-pascal] Concatenating CP Strings
> Setting the code page of a file tells the RTL about the encoding of the
> strings in the file. The string's static code page (which maps to
> DefaultSystemCodePage if none is specified) tells the compiler to which
> encoding this string data should be converted.

I know! That doesn't make it any more *useful*.

--
Regards,
Martok

___
fpc-pascal maillist - fpc-pascal@lists.freepascal.org
http://lists.freepascal.org/cgi-bin/mailman/listinfo/fpc-pascal
Re: [fpc-pascal] Concatenating CP Strings
On 15/09/18 22:32, Martok wrote:
> Gah, accidentally removed the comment that said what the actual problem is ;-)
>
>> ReadLn(f, s);
>> WriteLn(StringCodePage(s));
>
> That prints 1252, which is the DefaultSystemCodePage. At that point,
> information loss has already occurred; there is no way to fix the CP in
> user code. I would expect reading from a file whose codepage I have just
> set to return strings in that codepage. Instead, I get the declared
> codepage of the string.

Setting the code page of a file tells the RTL about the encoding of the
strings in the file. The string's static code page (which maps to
DefaultSystemCodePage if none is specified) tells the compiler to which
encoding this string data should be converted.

Jonas
Re: [fpc-pascal] Concatenating CP Strings
Gah, accidentally removed the comment that said what the actual problem is ;-)

> ReadLn(f, s);
> WriteLn(StringCodePage(s));

That prints 1252, which is the DefaultSystemCodePage. At that point,
information loss has already occurred; there is no way to fix the CP in
user code. I would expect reading from a file whose codepage I have just
set to return strings in that codepage. Instead, I get the declared
codepage of the string.

--
Regards,
Martok
Re: [fpc-pascal] Concatenating CP Strings
And another one:

var
  f: TextFile;
  s: string;
begin
  AssignFile(f, 'a_file.txt');
  SetTextCodePage(f, 866);
  Reset(f);
  ReadLn(f, s);
  WriteLn(StringCodePage(s));
  readln;
end.

That is rather useless... Writing anything into the specified codepage works
perfectly fine.

--
Regards,
Martok
Re: [fpc-pascal] Concatenating CP Strings
On 15.09.2018 at 07:34, Mattias Gaertner via fpc-pascal wrote:
> To have the result in a specific codepage use
> SetCodePage(result,NeededCP,true);

As I wrote, doing this by hand works, but I don't want to believe somebody
thought that was how it should be. Why would
"CP_UTF16 + CP_UTF8(literal) = cp1252" be the desired outcome?

operator >< (a, b: RawByteString): RawByteString;
var
  t: RawByteString;
  la, lt: Integer;
begin
  t:= b;
  SetCodePage(t, StringCodePage(a), True);
  la:= Length(a);
  lt:= Length(t);
  result:= a;
  SetLength(Result, la + lt);
  Move(t[1], Result[la+1], lt);
end;

With that, one can write "foo:= bar >< x" and it just works.

> Only on ancient Windows it was UCS2.

In that case, fpwidestring is wrong as well, see fpwidestring.pp:262.

MSDN is slightly unclear:
"1200  utf-16  Unicode UTF-16, little endian byte order (BMP of ISO 10646);
available only to managed applications"

The "managed applications" part is why WideCharToMultiByte simply returns an
empty string when asked to convert anything to cp1200, instead of just doing
the plain memcpy. "Only the BMP" would be UCS2. In other places, surrogate
pairs are mentioned, making it a true UTF encoding.

In any case, I think the RTL should be consistent across platforms?

--
Regards,
Martok