On 17/9/2011 11:46, Hans-Peter Diettrich wrote:
Luiz Americo Pereira Camara wrote:

The code page of a RawByteString at runtime will keep the previous CodePage (65001 for UTF-8, 1200 for UTF-16) rather than changing to the RawByteString CodePage (65535), as I previously thought.

Delphi defines RawByteString=AnsiString, so there is no room for UTF-16 in such a string.

No. I was wrong. See Florian's email. A RawByteString will keep the code page (1200 = UTF-16) and the data of the assigned string, whether that is UTF-16 or UTF-8.


So the implementation would be:

function FileGetAttr(const FileName: RawByteString): Longint;
begin
SetCodePage(FileName, 1200, True);

Won't work, because of "const",

Yes

and because UTF-16 is not a Byte (AnsiChar) string :-(

No. See above. Search the net for the "Delphi and Unicode" document by Marco Cantù.

Result:=Integer(Windows.GetFileAttributesW(PWideChar(FileName)));

Delphi would use
Result:=Integer(Windows.GetFileAttributesW(PWideChar(string(FileName))));

with a temporary UnicodeString variable and a corresponding try-finally block.

Yes

This way the version using a UnicodeString parameter would have the benefit of being less verbose and of using the possible optimizations of the implicit encoding conversion.

At best it *hides* the temporary variables and implicit conversions, but makes string handling more expensive.

I'm talking about:

function FileGetAttr(const FileName: UnicodeString): Longint;
begin
  Result:=Integer(Windows.GetFileAttributesW(PWideChar(FileName)));
end;


Inside the function there will be no conversion, since the string is already UTF-16; there is just a typecast to PWideChar, which is in fact a function call.

The conversion will be done before the function call, and only if necessary (e.g. UTF-8 -> UTF-16). The decision to convert or not is made at compile time.


With RawByteString

function FileGetAttr(const FileName: RawByteString): Longint;
begin
  Result:=Integer(Windows.GetFileAttributesW(PWideChar(UnicodeString(FileName))));
end;


Here the decision to convert or not is made at run time by checking the CodePage of FileName. There is also one more temp variable due to the UnicodeString typecast.

In summary:
With UnicodeString: the decision to convert is made at compile time
With RawByteString: the decision to convert is made at run time + one more temp variable
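
A minimal sketch of the difference, assuming a compiler with code-page-aware AnsiStrings (Delphi 2009 or later, or an FPC version with the new string types). LenViaUnicode and LenViaRaw are hypothetical names used only for illustration; StringCodePage is the RTL function that returns the dynamic code page of a string. The sample data is plain ASCII, so it does not depend on an installed widestring manager.

```pascal
program CodePageSketch;
{$mode objfpc}{$H+}

// UnicodeString parameter: any conversion is inserted by the compiler
// at the call site, so the body only ever sees UTF-16 data.
function LenViaUnicode(const S: UnicodeString): Integer;
begin
  Result := Length(S);
end;

// RawByteString parameter: the argument arrives unconverted and keeps
// its own code page. The typecast below converts at run time, based on
// that dynamic code page, and needs a temporary UnicodeString.
function LenViaRaw(const S: RawByteString): Integer;
var
  Tmp: UnicodeString;
begin
  WriteLn('dynamic code page seen by callee: ', StringCodePage(S));
  Tmp := UnicodeString(S);
  Result := Length(Tmp);
end;

var
  W: UnicodeString;
  U8: UTF8String;
begin
  W := 'file.txt';
  U8 := UTF8String(W);          // explicit UTF-16 -> UTF-8 conversion
  WriteLn(LenViaUnicode(W));    // already UTF-16, nothing to convert
  WriteLn(LenViaUnicode(U8));   // compiler adds UTF-8 -> UTF-16 here
  WriteLn(LenViaRaw(U8));       // code page is inspected inside the callee
end.
```

In the second call the conversion is visible to the compiler from the declared types alone; in the third it can only be decided once the callee looks at the string it actually received.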


As I understand the FPC developers, they want to reduce the number of implicit string conversions, which can best be achieved with dedicated string types.

I'm just saying that ;-) UnicodeString is better than RawByteString.

Luiz
