On 2011-10-21 00:20, Hans-Peter Diettrich wrote:
> your legacy code can assume that every (visible) character is a Char, in
> an SBCS codepage, this is not different in UTF-16.

Rookie mistake!!! You forgot surrogate pairs in UTF-16. Think outside the Unicode BMP, where a "visible" character takes 4 bytes, and thus two UTF-16 Char values. As I mentioned earlier, most programmers using UTF-16 treat it like UCS-2, forgetting that they need to check for surrogate pairs too.

Now in UTF-8, this is not a problem at all. Every non-ASCII character is already a multi-byte sequence, so finding a visible character in the BMP or in a Supplementary Plane is an identical process; no special checking is required. That makes UTF-8 much easier and safer to use.

I've ported enough Delphi code to FPC + fpGUI, where UTF-8 is used for Unicode support. I fully agree with Felipe: using UTF-8 is much easier with legacy code than UTF-16.

Regards,
  - Graeme -

--
fpGUI Toolkit - a cross-platform GUI toolkit using Free Pascal
http://fpgui.sourceforge.net/

--
_______________________________________________
Lazarus mailing list
[email protected]
http://lists.lazarus.freepascal.org/mailman/listinfo/lazarus
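
To illustrate the surrogate-pair point above, here is a minimal sketch. It is written in Python rather than Pascal purely because it is easy to run; the helper names (utf16_units, count_code_points_utf16, count_code_points_utf8) are made up for this example and are not part of FPC, Lazarus, or fpGUI. It assumes well-formed input and counts code points only, so combining sequences are not taken into account.

```python
# Illustration only: hypothetical helpers, not any library's API.
EURO = "\u20AC"      # U+20AC EURO SIGN, inside the BMP
CLEF = "\U0001D11E"  # U+1D11E MUSICAL SYMBOL G CLEF, outside the BMP


def utf16_units(s):
    """Number of 16-bit code units in the UTF-16 encoding of s."""
    return len(s.encode("utf-16-le")) // 2


def count_code_points_utf16(data):
    """Count code points in UTF-16-LE bytes.

    A high surrogate followed by a low surrogate must be treated as
    ONE code point; this is the check that UCS-2 style code forgets.
    """
    units = [int.from_bytes(data[i:i + 2], "little")
             for i in range(0, len(data), 2)]
    count = 0
    i = 0
    while i < len(units):
        is_high = 0xD800 <= units[i] <= 0xDBFF
        has_low = i + 1 < len(units) and 0xDC00 <= units[i + 1] <= 0xDFFF
        i += 2 if (is_high and has_low) else 1
        count += 1
    return count


def count_code_points_utf8(data):
    """Count code points in UTF-8 bytes.

    Every byte that is not a continuation byte (10xxxxxx) starts a new
    code point. The same rule covers BMP and supplementary planes.
    """
    return sum(1 for b in data if (b & 0xC0) != 0x80)


text = EURO + CLEF
print(utf16_units(EURO), utf16_units(CLEF))
print(count_code_points_utf16(text.encode("utf-16-le")))
print(count_code_points_utf8(text.encode("utf-8")))
```

The UTF-16 counter needs an explicit surrogate branch, while the UTF-8 counter applies one rule to every character. Code that simply takes the number of UTF-16 units as the character count would report 3 for this two-character string.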
