On 01.01.2011 22:29, Juha Manninen wrote:
Vladimir Zhirov wrote on Saturday, 01 January 2011 22:14:32:
Sven Barth wrote:
You need to convert the UTF8 string to a different one, e.g.
UTF16:
var
us: UnicodeString;
begin
us := UTF8Encode(s);
end;
Now us[2] will return the a-umlaut.
I would suggest using Utf8Copy(s, 2, 1) instead. It helps
to avoid conversion and works correctly even for characters
that take 4 bytes in UnicodeString/WideString (i.e. 2
wide characters). Utf8Copy is declared in the LCLProc unit.
So the conversion is only needed if a char inside the string is accessed by
index?
If you use the LCL in your application you can also use the UTF8Copy
function, which Vladimir mentioned.
Let's put it this way: if your String contains UTF-8 encoded text you
should not use [] or the normal Pos, Copy, etc. functions, because they
might return garbage. Use functions that can work with that encoding
(either by converting the string or working directly on it).
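To illustrate why [] is unsafe here, the following is a minimal sketch, assuming a recent Free Pascal compiler and no {$codepage} directive. The program name and the sample word are made up for the illustration; the a-umlaut is written as explicit UTF-8 bytes so the result does not depend on the source file's encoding.

```pascal
program Utf8IndexDemo;
{$mode objfpc}{$H+}

var
  s: AnsiString;
begin
  // 'h' + a-umlaut + 'n'; the a-umlaut is the two UTF-8 bytes $C3 $A4.
  s := 'h'#$C3#$A4'n';

  // Length counts bytes, not characters: 4 here, although there are 3 characters.
  WriteLn('Length(s) = ', Length(s));

  // s[2] is only the first byte of the a-umlaut, s[3] is the second one.
  WriteLn('Ord(s[2]) = ', Ord(s[2]));
  WriteLn('Ord(s[3]) = ', Ord(s[3]));
end.
```

Neither s[2] nor s[3] is a complete character on its own, which is the "garbage" that byte-oriented indexing, Pos or Copy can produce.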
I understand the principle, but I didn't understand how the functions
UTF8Encode and UTF8Decode work. Of course I don't need to understand such
details because I am not an FPC developer, but anyway ...
UTF8Encode returns a UTF8String, and the AnsiString parameter is internally
typecast to UnicodeString. How can that work?
You looked at the wrong function. I meant the one below it which has a
UnicodeString as argument. And this also solves the mystery:
Casting from AnsiString to UnicodeString invokes the WideString
Manager's Ansi2UnicodeMoveProc, which converts the supplied AnsiString to
a correct UTF-16 string. Then the function which takes a UnicodeString
as argument is invoked (it's an overloaded function after all) and the
UTF-16 string is converted to UTF-8.
Maybe Sven's example should use UTF8Decode. It returns UnicodeString.
According to the debugger, both functions convert the string to uppercase and
add some garbage to the beginning and end, but it may be a debugger error.
Yes, it should have used UTF8Decode. I used the wrong function. -.-
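For reference, a corrected version of the earlier snippet might look like the sketch below. It assumes a recent Free Pascal compiler where UTF8Decode accepts the string without an implicit code page conversion; the program name and sample word are only for illustration.

```pascal
program Utf8DecodeDemo;
{$mode objfpc}{$H+}

var
  s: AnsiString;
  us: UnicodeString;
begin
  // 'h' + a-umlaut + 'n', with the a-umlaut given as UTF-8 bytes $C3 $A4.
  s := 'h'#$C3#$A4'n';

  // UTF8Decode converts UTF-8 to UTF-16 (UTF8Encode goes the other way).
  us := UTF8Decode(s);

  // Length now counts UTF-16 code units: 3 for this string.
  WriteLn('Length(us) = ', Length(us));

  // us[2] is the complete a-umlaut, U+00E4 (decimal 228).
  WriteLn('Ord(us[2]) = ', Ord(us[2]));
end.
```

As Vladimir pointed out, indexing a UnicodeString is still not safe for characters outside the Basic Multilingual Plane, since those occupy two wide characters.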
Regards,
Sven