2011/9/15 Hans-Peter Diettrich <drdiettri...@aol.com>:
> cobines wrote:
>> When doing:
>> MyChar := MyString[1]
>>
>> appropriate function retrieves first unicode character, regardless of
>> encoding.
>
> This is just wrong :-(
>
> MyString[1] accesses the first element of the *physical* character array,
> regardless of any encoding. Also Length returns the array size, not the
> number of *logical* characters in it.
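
To make the physical/logical distinction above concrete, here is a minimal sketch. It assumes Free Pascal in objfpc mode, an AnsiString that happens to hold UTF-8 bytes, and a UnicodeString holding UTF-16 code units. Utf8CodePointCount is a hypothetical helper written for this illustration, not an RTL function, and it counts code points, which is still only an approximation of "logical characters" (combining sequences are not merged).

```pascal
program CodeUnitsDemo;
{$mode objfpc}{$H+}

// Hypothetical helper, for illustration only: counts UTF-8 code points by
// skipping continuation bytes (bit pattern 10xxxxxx).
// Assumes S contains well-formed UTF-8.
function Utf8CodePointCount(const S: AnsiString): Integer;
var
  I: Integer;
begin
  Result := 0;
  for I := 1 to Length(S) do
    if (Ord(S[I]) and $C0) <> $80 then
      Inc(Result);
end;

var
  U8: AnsiString;
  U16: UnicodeString;
begin
  // U+00E9 (e with acute) in UTF-8 is the two bytes C3 A9; then ASCII 'a'.
  SetLength(U8, 3);
  U8[1] := #$C3;
  U8[2] := #$A9;
  U8[3] := 'a';

  // Length counts array elements (bytes here), not characters.
  WriteLn('UTF-8 Length: ', Length(U8));
  // Indexing yields the first byte, which is only the lead byte of U+00E9.
  WriteLn('UTF-8 element 1: $', HexStr(LongInt(Ord(U8[1])), 2));
  WriteLn('UTF-8 code points: ', Utf8CodePointCount(U8));

  // U+1F600 lies outside the BMP, so UTF-16 stores it as the surrogate
  // pair D83D DE00.
  SetLength(U16, 2);
  U16[1] := WideChar($D83D);
  U16[2] := WideChar($DE00);

  // Length counts 16-bit code units; one code point occupies two of them.
  WriteLn('UTF-16 Length: ', Length(U16));
  // Indexing yields the high surrogate alone, not a complete character.
  WriteLn('UTF-16 element 1: $', HexStr(LongInt(Ord(U16[1])), 4));
end.
```

The UTF-16 half shows why indexed access only appears to work: for text inside the BMP each element is one code point, and the mismatch shows up only with surrogate pairs.
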
Right. My point was that someone coming from Ansi, who knows that MyString[1]
retrieves the first character and knows nothing about Unicode, might assume
it still retrieves the first character of a Unicode string regardless of the
string's encoding (that is, that the RTL handles it). As you say, that is
wrong, so the developer has to adapt any code that uses such access, but
people might not know this. Having a UTF-16 RTL might help them in the sense
that they will never have to learn it, until they deal with characters
outside of the BMP.

>> Whether it's utf8, utf16, utf32 or any other future encoding the code
>> should work the same.
>
> Very new functions are required for dealing with *logical* characters, in
> every MBCS encoding.

Hence the need to remove indexed access like MyString[1].

--
cobines