Re: [fpc-devel] RawByteString Insert etc.

2014-12-05 Thread Michael Schnell

On 12/05/2014 10:51 AM, Hans-Peter Diettrich wrote:
> IMO the Insert procedure should change the encoding of the
> string-to-insert into the CP of the target string. Else the target
> string can become unusable, containing a mix of characters from
> different codepages. While a RawByteString can have any encoding, it
> cannot have two encodings at the same time.


Is there any "decent" way (given multi-element sequences, which are
ubiquitous in UTF-8) to do "Insert" with arguments that specify an
*element* position? (As discussed several times already, there is no
point in redefining Insert and similar functions to count in anything
other than element positions.)


The common agreement is that users are discouraged from making any
assumptions about element positions (other than "1" = start of the
string and High(Integer) = as far as possible). Any such value should
be obtained from a call to "Pos()" or "Length()".
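
To illustrate the point, here is a minimal sketch, assuming FPC 3.0 or
later (where SetCodePage and CP_UTF8 are available in the System unit).
The program name and the sample string are made up for this example.
Every position passed to Insert comes from Pos() or Length(), never from
counting characters by hand:

```pascal
program PosInsertSketch;
{$mode objfpc}{$H+}

var
  S: RawByteString;
  P: SizeInt;
begin
  // "grün-blau" spelled out as UTF-8 bytes; the "ü" occupies two elements.
  S := 'gr'#$C3#$BC'n-blau';
  SetCodePage(S, CP_UTF8, False);  // label the bytes as UTF-8, no conversion

  // Length counts elements (bytes here), not characters.
  WriteLn('elements: ', Length(S));

  // Obtain the position from Pos rather than counting characters by hand.
  P := Pos('-', S);
  if P > 0 then
    Insert('-hell', S, P);  // ASCII-only insert at a position found by Pos

  // Appending: the position is derived from Length.
  Insert('!', S, Length(S) + 1);
  WriteLn('elements: ', Length(S));
end.
```

Because the "-" is found by Pos(), the insert cannot land in the middle
of the two-byte "ü", whatever its element index happens to be.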




> BTW, the documentation should be updated to RawByteString arguments.


Which kind of RawByteString? AFAIK from your mails over the last few
days, in Delphi RawByteString is kind of "dynamic", while in fpc
RawByteString is statically defined as something similar to a byte
array.


(BTW, can a Delphi RawByteString hold a 2-byte encoding such as UTF-16,
which is the default "String" in Delphi, and if it is used that way, does
Pos(MyRawByteString) count in bytes or in words?)


OTOH, I understand that "Insert", "Pos", "Copy", "Length", "Delete",
"Concat" etc. are built-in functions whose arguments are "virtually"
handled as if they had the non-existent DynamicString type and which,
if two arguments with different encoding brands are given, do
*appropriate* auto-conversion. Otherwise no conversion of the input data
to a fixed type (i.e. the default "String") is forced (which is what
would happen with a "normal" function that declares "String" as its
argument type).
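
One way to check this is a small probe program. The following is only a
sketch, assuming FPC 3.0 or later; TCP1252String is a hypothetical type
declared just for the example, and cwstring is pulled in on Unix on the
assumption that a widestring manager is needed for any conversion. It
prints the code page brand of two differently branded strings and of
their concatenation. The output depends on compiler version and
platform, so none is given here:

```pascal
program CodePageProbe;
{$mode objfpc}{$H+}
{$ifdef unix}uses cwstring;{$endif}  // assumption: conversion support on Unix

type
  // Hypothetical type for this sketch: an AnsiString branded as Windows-1252.
  TCP1252String = type AnsiString(1252);

var
  U: UTF8String;
  W: TCP1252String;
  R: RawByteString;
begin
  U := 'abc';
  W := 'def';
  R := U + W;  // operands carry different encoding brands

  // Print the brand each string carries at run time.
  WriteLn('U:     ', StringCodePage(U));
  WriteLn('W:     ', StringCodePage(W));
  WriteLn('U + W: ', StringCodePage(R));

  // Pos with differently branded arguments; the result is an element position.
  WriteLn('Pos:   ', Pos(W, R));
end.
```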


Hence for RawByteString the behavior is undefined. As we have already
learned, the behavior of RawByteString is usually undefined anyway.




-Michael
___
fpc-devel maillist  -  fpc-devel@lists.freepascal.org
http://lists.freepascal.org/cgi-bin/mailman/listinfo/fpc-devel


[fpc-devel] RawByteString Insert etc.

2014-12-05 Thread Hans-Peter Diettrich
IMO the Insert procedure should change the encoding of the 
string-to-insert into the CP of the target string. Else the target 
string can become unusable, containing a mix of characters from 
different codepages. While a RawByteString can have any encoding, it 
cannot have two encodings at the same time.
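
A minimal sketch of the proposed behavior, assuming FPC 3.0 or later
(SetCodePage and StringCodePage from the System unit). InsertConverted
is a hypothetical helper name, not an RTL routine, and on Unix the
conversion assumes a widestring manager such as cwstring is installed:

```pascal
program InsertConvertedSketch;
{$mode objfpc}{$H+}
{$ifdef unix}uses cwstring;{$endif}  // assumption: needed for conversion on Unix

// Hypothetical helper: convert Source to the code page of Target, then insert.
procedure InsertConverted(const Source: RawByteString;
  var Target: RawByteString; Index: SizeInt);
var
  Tmp: RawByteString;
begin
  Tmp := Source;
  // An empty Target has no code page of its own, so skip conversion then.
  if (Target <> '') and (StringCodePage(Tmp) <> StringCodePage(Target)) then
    SetCodePage(Tmp, StringCodePage(Target), True);  // True = convert the bytes
  Insert(Tmp, Target, Index);
end;

var
  Dest, Part: RawByteString;
begin
  Dest := 'caf'#$C3#$A9' ';           // "café " as UTF-8 bytes
  SetCodePage(Dest, CP_UTF8, False);  // label only, no conversion
  Part := #$FC'ber';                  // "über" as Windows-1252 bytes
  SetCodePage(Part, 1252, False);     // label only, no conversion

  InsertConverted(Part, Dest, Length(Dest) + 1);  // append at the end
  WriteLn('code page: ', StringCodePage(Dest), ', length: ', Length(Dest));
end.
```

Working on a local copy keeps the caller's Source untouched, and the
target keeps a single encoding after the insert.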


BTW, the documentation should be updated to RawByteString arguments.


More candidates:
Concat (implemented where? operator +=?)
Pos (make SubStr CP match Source CP)

To be converted to RawByteString at all (overload?):
Format (?)
StringReplace
LastDelimiter, IsDelimiter (in case of non-ASCII delimiters?)
...

Should I supply patches?

DoDi
