Paul Ishenin wrote:
13.10.2011 9:13, Hans-Peter Diettrich wrote:
Sven Barth wrote:
http://svn.freepascal.org/cgi-bin/viewvc.cgi/trunk/rtl/inc/astrings.inc?revision=19444&view=markup
I don't understand the use of encoding 0 and CP_NONE in FPC. Can
somebody explain?
DefaultSystemCodepage is used when the particular encoding must be known,
both for 0 and CP_NONE codepages.
What's CP_NONE? Value and purpose?
Furthermore I suspect that the implementation of
Function Pos(Const Substr : RawByteString; Const Source : RawByteString)
: SizeInt;
is wrong. The comparison must take into account the encodings of both
strings.
Function Pos(c : AnsiChar; Const s : RawByteString) : SizeInt;
is questionable altogether when the encoding of the arguments is unknown
or ignored (as it is now).
Did you run some tests in Delphi for these functions?
Please run the next program to be sure:
{$apptype console}
type
  T866Sting = type AnsiString(866);
  T1251String = type AnsiString(1251);
var
  s866: T866Sting;
  s1251: T1251String;
begin
  s866 := 'привет';
  writeln(s866);
  s1251 := 'рив';
  writeln(s1251);
  writeln(pos(s1251, s866));
end.
I have a problem, because my console only supports the OEM font.
The output (pasted as Ansi) looks like this:
»a¿óÑG
=FG
0
With the strings replaced by plain ASCII text, the output is
abcdef
bcd
2
Now we have two problems. Pos() expects Unicode arguments, i.e. the
strings should be converted before comparison. And something is
obviously wrong with the Cyrillic strings, result=0 :-(
I tested again, assigning the strings to "string" variables before Pos,
and now
»a¿óÑG
=FG
0
2 //using Unicode strings
It turned out that the result is only correct when at least one of the
strings is a UnicodeString. Otherwise Pos seems to end up in a
RawByteString compare, with the encoding ignored.
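To make the byte-wise compare visible, here is a small sketch that dumps the dynamic code page and the raw bytes of both strings. The helper Dump is hypothetical (not an RTL routine), and the strings are built from UTF-16 code points so that the source file encoding does not matter. It assumes Delphi 2009+ or FPC 3.x with a working Unicode conversion (Windows, or cwstring on Unix). Going by the standard code page tables, 'рив' is E0 A8 A2 in CP866 but F0 E8 E2 in CP1251, so a binary search cannot find one inside the other.

```pascal
program DumpEncodedBytes;
{$ifdef FPC}{$mode delphi}{$endif}
{$apptype console}

type
  T866String  = type AnsiString(866);
  T1251String = type AnsiString(1251);

// Hypothetical helper: prints the dynamic code page and the raw bytes of S.
// A RawByteString parameter accepts any AnsiString without conversion.
procedure Dump(const Name: string; const S: RawByteString);
var
  i: Integer;
begin
  Write(Name, ': code page ', StringCodePage(S), ', bytes');
  for i := 1 to Length(S) do
    Write(' ', Ord(S[i]));
  Writeln;
end;

var
  u: UnicodeString;
  s866: T866String;
  s1251: T1251String;
begin
  u := #$043F#$0440#$0438#$0432#$0435#$0442;  // 'привет' as UTF-16 code points
  s866 := T866String(u);                      // converted to CP866 by the cast
  u := #$0440#$0438#$0432;                    // 'рив'
  s1251 := T1251String(u);                    // converted to CP1251 by the cast
  Dump('s866', s866);
  Dump('s1251', s1251);
end.
```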
I think that I have to ask in a Delphi group.
I have the bad impression that the implementors didn't understand the
purpose and correct use of RawByteString, or try to implement something
incompatible with Delphi :-(
Do you base your impression on particular knowledge of how the Delphi RTL
works, or only on your ideas of how it should work?
In my understanding Pos() et al. should respect the encoding, and should
not do a binary compare of AnsiStrings of different encodings.
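A minimal sketch of what I mean, under the assumption that converting both arguments to UnicodeString is acceptable. PosByEncoding is a hypothetical helper, not an existing RTL function, and this is not a claim about how Delphi or FPC implement Pos. It relies on the cast from RawByteString to UnicodeString honouring the dynamic code page of each string, and on a working Unicode conversion (Windows, or cwstring on Unix).

```pascal
program PosWithEncoding;
{$ifdef FPC}{$mode delphi}{$endif}
{$apptype console}

type
  T866String  = type AnsiString(866);
  T1251String = type AnsiString(1251);

// Hypothetical helper: compares by characters, not by bytes.
// Each cast converts from the string's own dynamic code page to UTF-16.
function PosByEncoding(const SubStr, Source: RawByteString): NativeInt;
begin
  Result := Pos(UnicodeString(SubStr), UnicodeString(Source));
end;

var
  u: UnicodeString;
  s866: T866String;
  s1251: T1251String;
begin
  u := #$043F#$0440#$0438#$0432#$0435#$0442;  // 'привет' as UTF-16 code points
  s866 := T866String(u);
  u := #$0440#$0438#$0432;                    // 'рив'
  s1251 := T1251String(u);
  Writeln(Pos(s1251, s866));            // plain Pos on two AnsiStrings
  Writeln(PosByEncoding(s1251, s866));  // both sides converted first
end.
```

The second call corresponds to the variant with the strings assigned to "string" variables, which gave 2 in my test. Note that the returned index counts UTF-16 code units, which matches the byte index only for single-byte code pages like these.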
What do you think, is it okay when the result of the same function
depends on the argument type, i.e.
writeln(pos(s1251, s866)); //returns 0
writeln(pos(string(s1251), s866)); //returns 2
???
Thanks for your interesting example :-)
DoDi
_______________________________________________
fpc-devel maillist - fpc-devel@lists.freepascal.org
http://lists.freepascal.org/mailman/listinfo/fpc-devel