Paul Ishenin wrote:
13.10.2011 9:13, Hans-Peter Diettrich wrote:
Sven Barth wrote:

http://svn.freepascal.org/cgi-bin/viewvc.cgi/trunk/rtl/inc/astrings.inc?revision=19444&view=markup


I don't understand the use of encoding 0 and CP_NONE in FPC. Can
somebody explain?

DefaultSystemCodepage is used for both the 0 and CP_NONE codepages when the particular encoding must be known.

What's CP_NONE? Value and purpose?

Furthermore I suspect that the implementation of

Function Pos(Const Substr : RawByteString; Const Source : RawByteString)
: SizeInt;

is wrong. The comparison must take into account the encodings of both
strings.


Function Pos(c : AnsiChar; Const s : RawByteString) : SizeInt;
is altogether questionable when the encoding of the arguments is unknown
or ignored (as it is now).

Did you run some tests in Delphi for these functions?

Please run the next program to be sure:

{$apptype console}

type
  T866String = type AnsiString(866);
  T1251String = type AnsiString(1251);
var
  s866: T866String;
  s1251: T1251String;
begin
  s866 := 'привет';
  writeln(s866);
  s1251 := 'рив';
  writeln(s1251);
  writeln(pos(s1251, s866));
end.

I've a problem, because my console only supports the OEM font.
The output (pasted as Ansi) looks like
»a¿óÑG
=FG
0

With the strings replaced by plain ASCII text, the output is
abcdef
bcd
2

Now we have two problems. First, Pos() expects Unicode arguments, i.e. the strings should be converted before comparison. Second, something is obviously wrong with the Cyrillic strings: result=0 :-(

I tested again, assigning the strings to "string" variables before calling Pos, and now the output is
»a¿óÑG
=FG
0
2 //using Unicode strings

It turned out that the result is only correct when at least one of the strings is a UnicodeString. Otherwise Pos seems to end up in a RawByteString compare, with the encoding ignored.

I think that I have to ask in a Delphi group.


I have the bad impression that the implementors didn't understand the
purpose and correct use of RawByteString, or try to implement something
incompatible with Delphi :-(

Do you base your impression on particular knowledge of how the Delphi RTL works, or only on your ideas of how it should work?

In my understanding Pos() et al. should respect the encoding, and should not do a binary compare of AnsiStrings of different encodings.
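
To illustrate what "respecting the encoding" could look like, here is a minimal sketch, not the RTL implementation. PosByEncoding is a hypothetical helper name. It assumes a compiler with code-page-aware AnsiStrings (Delphi 2009+ or a recent FPC), a source file saved in an encoding the compiler recognizes, and, on non-Windows FPC targets, an installed widestring manager. It simply converts both arguments to UnicodeString, which is the case that returned the expected result in my test.

```pascal
{$apptype console}
{$ifdef FPC}{$mode delphi}{$codepage utf8}{$endif} // assumption: source saved as UTF-8

type
  T866String = type AnsiString(866);
  T1251String = type AnsiString(1251);

// Hypothetical helper, not part of any RTL: searches by characters, not by raw bytes.
function PosByEncoding(const Substr, Source: RawByteString): Integer;
begin
  // RawByteString parameters keep the code page of the actual argument.
  // The UnicodeString casts convert each string from its own code page,
  // so both operands are in one common encoding before the search.
  Result := Pos(UnicodeString(Substr), UnicodeString(Source));
end;

var
  s866: T866String;
  s1251: T1251String;
begin
  s866 := 'привет';
  s1251 := 'рив';
  writeln(PosByEncoding(s1251, s866));
end.
```

Note that the returned index counts UTF-16 code units of the converted string. For single-byte code pages such as 866 and 1251 this matches the byte position in Source, but for multi-byte encodings it would not.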

What do you think, is it okay that the result of the same function depends on the argument type, i.e.
  writeln(pos(s1251, s866)); //returns 0
  writeln(pos(string(s1251), s866)); //returns 2
???

Thanks for your interesting example :-)

DoDi
