On Fri, Oct 05, 2007 at 01:14:23PM +0200, Luca Olivetti wrote:
> [EMAIL PROTECTED] wrote:
>
>> * WideString allows indexed "[]" access to individual chars.
>> This does not seem to be correct. I read that a UTF-16 character can be
>> 4 bytes long, so some calculation is sometimes needed...
>
> Unless you're dealing with klingon and ancient languages,
Like Chinese? Just a billion people use it...not a real problem at all...
:-\
> I think you can assume that for 99.99% of currently spoken languages every
> character will be exactly 2 bytes long.
Wrong, as I said before.
> There's a risk of having some character with more than 2 bytes but it is
> a small risk.
> With UTF-8 the risk is bigger, so you always have to traverse
> the string if you need access to a specific character index.
You have to traverse the string for both UTF-8 and UTF-16 encodings,
so the advantage of UTF-16 here is questionable at best...
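
To make the point concrete, here is a minimal sketch in Python (not
Free Pascal, and not how WideString is implemented). The helper names
utf16_units and char_at are made up for this example. It assumes
well-formed UTF-16 and shows that finding the n-th character means
walking the code units, because a character outside the Basic
Multilingual Plane takes two 16-bit units (a surrogate pair):

```python
def utf16_units(text):
    """Return the UTF-16 code units of text as a list of 16-bit integers."""
    raw = text.encode("utf-16-le")
    return [int.from_bytes(raw[i:i + 2], "little") for i in range(0, len(raw), 2)]


def is_high_surrogate(unit):
    """True if unit is the first half of a surrogate pair (0xD800..0xDBFF)."""
    return 0xD800 <= unit <= 0xDBFF


def char_at(units, index):
    """Return the index-th character by walking the code units from the start.

    Hypothetical helper: assumes units is well-formed UTF-16.
    """
    pos = 0
    for _ in range(index):
        # Skip two units for a surrogate pair, one unit otherwise.
        pos += 2 if is_high_surrogate(units[pos]) else 1
    width = 2 if is_high_surrogate(units[pos]) else 1
    raw = b"".join(u.to_bytes(2, "little") for u in units[pos:pos + width])
    return raw.decode("utf-16-le")


# U+20000 is a CJK ideograph outside the BMP, so it needs two code units.
units = utf16_units("a\U00020000b")
print(len(units))                       # 4 code units for 3 characters
print(hex(units[2]))                    # naive units[2] is a lone low surrogate
print(char_at(units, 2) == "b")         # walking the string finds the real 3rd char
```

Plain "[]" indexing corresponds to units[i] above, which is only
correct as long as no surrogate pair comes before position i.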
ciao
--
Marco Ciampa
+--------------------+
| Linux User #78271 |
| FSFE fellow #364 |
+--------------------+