Re: [lazarus] UTF-8 vs UTF-16 support

Vincent Snijders Fri, 05 Oct 2007 00:57:53 -0700

Michael Van Canneyt wrote:


On Fri, 5 Oct 2007, Graeme Geldenhuys wrote:

Hi,

I asked a similar question in the MSEgui newsgroup as well.  What was
the reason for choosing to support UTF-8 instead of UTF-16?

----- Quoted Mattias from 6 months ago  --------------
The LCL will support UTF-8 and provide some extra functions for UTF-16,
because UTF-8 is more compatible with existing Pascal programs
-----------   END   --------------


Does this mean UTF-8 was chosen only because it is more compatible
with existing Pascal programs?  Any other reasons?


It uses less memory.

These are the pro points I received for using UTF-16 in MSEgui.

* It is faster to work with UTF-16 (and so WideString) encoded text
compared to UTF-8.
* Easier to implement.
* WideString allows indexed "[]" access to individual chars.
* Has predictable "length()" value.  (not sure what they meant here)

It means BufferSize = Length*SizeOf(WideChar). With UTF-8, you need to
calculate it.

I think they mean numofchar(widestring) = bytes allocated / 2. For a
UTF-8 string you need to parse it to get the length.


So length(widestring) is an O(1) operation, length(UTF8String) is an O(n)
operation.
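
To make the difference concrete, here is a minimal sketch in Python
(not Pascal, and not how the RTL or LCL implements it). The helper
names utf16_length and utf8_length and the sample text are
hypothetical, and the sample is assumed to contain only characters
from the Basic Multilingual Plane, so that one 16-bit unit is one
character.

```python
def utf16_length(buf: bytes) -> int:
    # O(1): the number of 16-bit units follows from the byte size alone,
    # i.e. bytes allocated / 2, with no need to look at the content.
    return len(buf) // 2


def utf8_length(buf: bytes) -> int:
    # O(n): every byte has to be inspected. Continuation bytes have the
    # bit pattern 10xxxxxx and do not start a new character, so only the
    # other bytes are counted.
    return sum(1 for b in buf if (b & 0xC0) != 0x80)


# Hypothetical sample text, BMP characters only.
text = "h\u00e9llo"
wide = text.encode("utf-16-le")   # 2 bytes per character here
utf8 = text.encode("utf-8")       # 1 byte per ASCII char, 2 for U+00E9

print(len(wide), utf16_length(wide))
print(len(utf8), utf8_length(utf8))
```

Both helpers report the same number of characters, but only the UTF-8
one has to walk the buffer, and the UTF-8 byte size cannot be derived
from the character count without that walk.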

Vincent

