Re: [Lazarus] Does Lazarus support a complete Unicode Component Library?

Graeme Geldenhuys Wed, 16 Feb 2011 04:42:58 -0800

On 2011-02-16 12:52, Hans-Peter Diettrich wrote:
> Most people have been sure, in the past, that they use a SBCS, where
> every character on screen is a char in memory. And consequently they use
> indexed access to the chars in a string, and for...to loops.


Yes, and 99% of the time that code accesses string characters
sequentially, be that left-to-right or the other way round, and hardly
ever at random. So to overcome this supposed limitation, one simply
needs to create a StringIterator (which I already have in my projects
where character extraction is needed), and that will work just fine.
So I don't see this as a problem at all.
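
To illustrate the iterator idea, here is a minimal sketch in Python. It
is not the StringIterator mentioned above; the name iter_code_points is
made up for this example. It walks a UTF-8 byte buffer left to right and
yields one code point at a time, so no random indexing is needed.

```python
def iter_code_points(data: bytes):
    """Yield one code point at a time from UTF-8 bytes, left to right."""
    i = 0
    n = len(data)
    while i < n:
        lead = data[i]
        # The lead byte tells us how many bytes this code point occupies.
        if lead < 0x80:
            length = 1
        elif lead >> 5 == 0b110:
            length = 2
        elif lead >> 4 == 0b1110:
            length = 3
        elif lead >> 3 == 0b11110:
            length = 4
        else:
            raise ValueError(f"invalid UTF-8 lead byte at offset {i}")
        # decode() validates the continuation bytes and raises on bad input.
        yield data[i:i + length].decode("utf-8")
        i += length


# Usage: A, e-acute, euro sign, and a code point outside the BMP.
sample = "A\u00e9\u20ac\U0001F600".encode("utf-8")
code_points = list(iter_code_points(sample))
```

Note that this yields code points, not what the user sees as one
character on screen; combining sequences would still need grouping on
top of it.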

> The same
> procedures may work for UTF-16,

No, character indexes will not work for UTF-16 either. Not ALL Unicode
characters fit into 2 bytes. Also, what about screen characters that
are made up of multiple code points (combining diacritics etc.)?
e.g.:
  U+0041 (A) + U+030A (̊)      =   Å

Depending on how that string is normalized, doing a MyString[1] might
only return 'A' and not Å as you would have expected.
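
A small Python sketch of the normalization point, using the standard
unicodedata module. Python indexes strings by code point and from 0, so
index 0 here plays the role of MyString[1]; the variable names are only
for illustration.

```python
import unicodedata

# A followed by COMBINING RING ABOVE: two code points, one character on screen.
decomposed = "\u0041\u030a"

# NFC normalization composes the pair into the single code point U+00C5.
composed = unicodedata.normalize("NFC", decomposed)

first_of_decomposed = decomposed[0]   # only the base letter 'A'
first_of_composed = composed[0]       # the precomposed A with ring above
```

The same visible character gives a different "first character"
depending on the normalization form of the string.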


> one widechar, but this code will fail miserably on an UTF-8 platform,

And the same goes for UTF-16, as I have just shown. If you want to use
UTF-16 like that (just because *most* of the Unicode code points fit
into 2 bytes), then it is no better than UCS-2.
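
As a rough Python illustration of why indexing by 16-bit unit breaks
down: a code point outside the Basic Multilingual Plane is encoded in
UTF-16 as a surrogate pair, i.e. two units. The helper name utf16_units
is made up for this sketch.

```python
def utf16_units(text: str) -> list[int]:
    """Return the 16-bit code units of text when encoded as UTF-16."""
    data = text.encode("utf-16-le")
    return [int.from_bytes(data[i:i + 2], "little")
            for i in range(0, len(data), 2)]


bmp_units = utf16_units("A")               # one unit: fits in 2 bytes
clef_units = utf16_units("\U0001D11E")     # U+1D11E MUSICAL SYMBOL G CLEF
```

Here clef_units holds two values, a high and a low surrogate, so taking
"the first widechar" of that string gives half a character.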



Regards,
  - Graeme -

-- 
fpGUI Toolkit - a cross-platform GUI toolkit using Free Pascal
http://fpgui.sourceforge.net/

