12.10.2011 15:29, Alex Shishkin wrote:
My proposed changes to spstring.
1) If a string is declared w/o an explicit encoding (e.g. just "string" under
the H+ modeswitch, or "ansistring"), it is treated as RawByteString.
2) In Unicode Delphi mode the encoding of all string constants is
forced to UTF-16, whatever the source encoding is. String variables are
forced to UTF-16 as well.
But most Unicode Delphi code could still be compiled in plain Delphi mode.
3) All RTL string routines should be encoding-aware (accept
RawByteString). There is no need for separate Unicode versions.
4) UTF8String, RawByteString and UnicodeString are aliases, not distinct
types.
5) Concatenation of two RawByteStrings converts the right operand to the
left operand's encoding.
6) Maybe use the concept of a "universal string" from my previous message:
"ansistring" (w/o explicit encoding) = RawByteString + clause 5.
In short, the main idea is to use RawByteStrings as widely as possible but
avoid data corruption (perform code page conversion only when it is
absolutely necessary).
7) String indexing.
If the string is "universal", indexing is always byte-based (compatible with
Delphi XE ansistring and legacy ansistring), so s[i] is always the i-th byte.
For UTF-16 strings indexing is word-based, of course. Indexing of UTF8String
is an open question (the i-th byte or the i-th Unicode (cardinal) character).
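
Points 5 and 7 can be sketched roughly as follows. This is only a model in
Python, used as a neutral notation, and not FPC or Delphi RTL code: the names
RawBytes, byte_at and utf16_unit_at are hypothetical, and raising an error
when the right operand cannot be represented in the left operand's code page
is an assumption, since the proposal does not say what should happen then.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RawBytes:
    """Hypothetical stand-in for a RawByteString: bytes plus a code page tag."""
    data: bytes
    codepage: str  # a Python codec name, e.g. "cp1252" or "utf-8"

    def __add__(self, other: "RawBytes") -> "RawBytes":
        # Point 5: the result keeps the left operand's code page.
        if other.codepage == self.codepage:
            # Same tag: plain byte concatenation, no conversion at all.
            return RawBytes(self.data + other.data, self.codepage)
        # Different tags: transcode the right operand to the left's code page.
        # Assumption: unrepresentable characters raise UnicodeEncodeError.
        text = other.data.decode(other.codepage)
        return RawBytes(self.data + text.encode(self.codepage), self.codepage)

    def byte_at(self, i: int) -> int:
        # Point 7: 1-based, byte-based indexing, like Pascal's s[i].
        return self.data[i - 1]


def utf16_unit_at(text: str, i: int) -> int:
    """Return the i-th (1-based) 16-bit code unit of text encoded as UTF-16."""
    raw = text.encode("utf-16-le")
    return int.from_bytes(raw[2 * (i - 1):2 * i], "little")


# b"na\xefve " is "naive " with a diaeresis on the i, stored in cp1252.
left = RawBytes(b"na\xefve ", "cp1252")
right = RawBytes("caf\u00e9".encode("utf-8"), "utf-8")

joined = left + right          # right operand is converted to cp1252
fourth_byte = right.byte_at(4)  # first byte of the two-byte UTF-8 "e acute"
fourth_char = right.data.decode("utf-8")[3]  # the whole character instead
second_unit = utf16_unit_at("a\U0001D11E", 2)  # high surrogate of U+1D11E
```

The last three lines show why UTF8String indexing is the open case: the 4th
byte and the 4th character of the same string are different things, while a
word-based UTF-16 index can land on half of a surrogate pair.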
_______________________________________________
fpc-devel maillist - fpc-devel@lists.freepascal.org
http://lists.freepascal.org/mailman/listinfo/fpc-devel