On 2011-02-16 12:52, Hans-Peter Diettrich wrote:
> Most people have been sure, in the past, that they use a SBCS, where
> every character on screen is a char in memory. And consequently they use
> indexed access to the chars in an string, and for...to loops.

Yes, and 99% of the time that code accesses string characters
sequentially, be that left-to-right or the other way round, hardly ever
at random. So to overcome this supposed limitation, one simply needs a
StringIterator (which I already have in my projects where character
extraction is needed), and that works just fine. So I don't see this as
a problem at all.

> The same
> procedures may work for UTF-16,

No, character indexes will not work for UTF-16 either. Not ALL Unicode
characters fit into 2 bytes. Also, what about screen characters that are
made up of multiple code points (combining diacritics etc.)? E.g.:

  U+0041 (A) + U+030A (̊) = Å

Depending on how that string is normalized, MyString[1] might return
only 'A' and not Å as you would have expected.

> one widechar, but this code will fail miserably on an UTF-8 platform,

And so too for UTF-16, as I have just shown. If you want to use UTF-16
like that (just because *most* of the Unicode code points fit into
2 bytes), then it is no better than UCS-2.

Regards,
  - Graeme -

--
fpGUI Toolkit - a cross-platform GUI toolkit using Free Pascal
http://fpgui.sourceforge.net/
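
P.S. The two failure modes above (a combining sequence, and a code point
outside the 2-byte range) can be shown in a few lines. This is a minimal
sketch in Python rather than Pascal, purely for illustration. The helper
names utf16_units and iter_clusters are hypothetical and are not the
StringIterator mentioned above. The cluster iterator is deliberately
simplified: it only attaches combining marks to the preceding base
character and does not implement full Unicode grapheme segmentation.

```python
import unicodedata


def utf16_units(text):
    """Return the UTF-16 code units of text as a list of integers."""
    raw = text.encode("utf-16-be")
    return [int.from_bytes(raw[i:i + 2], "big") for i in range(0, len(raw), 2)]


def iter_clusters(text):
    """Yield each base character together with the combining marks after it.

    Simplified on purpose: it only consults unicodedata.combining(), so it
    is not a full grapheme-cluster implementation.
    """
    cluster = ""
    for ch in text:
        # A non-combining character starts a new cluster.
        if cluster and unicodedata.combining(ch) == 0:
            yield cluster
            cluster = ""
        cluster += ch
    if cluster:
        yield cluster


# U+0041 + U+030A: one character on screen, two code points in memory.
decomposed = "A\u030A"
# The same text after NFC normalization is the single code point U+00C5.
composed = unicodedata.normalize("NFC", decomposed)

# U+1D11E (musical G clef) lies outside the range that fits in 2 bytes,
# so UTF-16 has to encode it as a surrogate pair.
clef = "\U0001D11E"

if __name__ == "__main__":
    print(repr(decomposed[0]))                    # indexing yields only 'A'
    print(len(decomposed), len(composed))         # 2 1
    print(list(iter_clusters(decomposed + "b")))  # two clusters, not three
    print([hex(u) for u in utf16_units(clef)])    # two 16-bit units
```

Indexing the decomposed string gives 'A' alone, while iterating by
cluster keeps the ring attached to it. The clef takes two 16-bit units,
so a per-widechar index splits it in the middle.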
