Re: [Lazarus] cwstring in arm-linux

Graeme Geldenhuys Fri, 21 Oct 2011 00:04:17 -0700

On 2011-10-21 00:20, Hans-Peter Diettrich wrote:
> your legacy code can assume that every (visible) character is a Char, in 
> an SBCS codepage, this is not different in UTF-16.


Rookie mistake!!! You forgot surrogate pairs in UTF-16. Think outside
the Unicode BMP, where a "visible" character takes 4 bytes, thus two
UTF-16 Char values. As I mentioned earlier, most programmers using
UTF-16 treat it like UCS-2, forgetting that they need to check for
surrogate pairs too.
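
To make the surrogate-pair check concrete, here is a minimal sketch in
Python (not from the original thread; the function names are
hypothetical and chosen only for illustration). It walks a list of
UTF-16 code units and counts a high surrogate followed by a low
surrogate as a single code point, which is the check that UCS-2-style
code tends to omit.

```python
def utf16_units(text):
    """Return the UTF-16 code units of text as a list of integers."""
    data = text.encode("utf-16-le")
    return [int.from_bytes(data[i:i + 2], "little")
            for i in range(0, len(data), 2)]


def count_code_points_utf16(units):
    """Count code points, treating a surrogate pair as one code point."""
    count = 0
    i = 0
    while i < len(units):
        unit = units[i]
        # A high surrogate (0xD800..0xDBFF) followed by a low surrogate
        # (0xDC00..0xDFFF) encodes one code point outside the BMP.
        is_pair = (
            0xD800 <= unit <= 0xDBFF
            and i + 1 < len(units)
            and 0xDC00 <= units[i + 1] <= 0xDFFF
        )
        i += 2 if is_pair else 1
        count += 1
    return count


# 'a' plus MUSICAL SYMBOL G CLEF (U+1D11E), which lies outside the BMP.
sample = "a\U0001D11E"
units = utf16_units(sample)
```

Here `len(units)` is 3 while `count_code_points_utf16(units)` is 2, so
code that assumes one Char per character is off by one as soon as a
supplementary-plane character appears.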

Now in UTF-8, this is not a problem at all. Finding a visible character
in the BMP or a Supplementary Plane is an identical process; no special
checking is required. This makes UTF-8 much easier and safer to use.
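
As a rough illustration of that uniformity, the sketch below (Python,
not from the original thread; `count_code_points_utf8` is a
hypothetical name) counts code points in a UTF-8 byte string with one
rule: skip continuation bytes of the form 10xxxxxx. The same test
covers BMP and supplementary-plane characters. It counts code points
only and assumes well-formed input; combining sequences are a separate
concern in any encoding.

```python
def count_code_points_utf8(data):
    """Count code points in well-formed UTF-8 bytes."""
    # Every byte that is not a continuation byte (10xxxxxx) starts a
    # new code point, whether the sequence is 1, 2, 3, or 4 bytes long.
    return sum(1 for b in data if (b & 0xC0) != 0x80)


bmp = "\u00e9".encode("utf-8")            # U+00E9, inside the BMP, 2 bytes
supp = "\U0001D11E".encode("utf-8")       # U+1D11E, outside the BMP, 4 bytes
mixed = "a\u00e9\u20ac\U0001D11E".encode("utf-8")  # 1 + 2 + 3 + 4 bytes
```

Both `bmp` and `supp` count as one code point each, and `mixed` counts
as four, with no branch that distinguishes the planes.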

I've ported enough Delphi code to FPC + fpGUI, where UTF-8 is used for
Unicode support. I fully agree with Felipe: using UTF-8 is much easier
with legacy code than UTF-16.

Regards,
  - Graeme -

-- 
fpGUI Toolkit - a cross-platform GUI toolkit using Free Pascal
http://fpgui.sourceforge.net/


--
_______________________________________________
Lazarus mailing list
[email protected]
http://lists.lazarus.freepascal.org/mailman/listinfo/lazarus
