On Tue, Mar 16, 2010 at 8:30 AM, Lester Caine <les...@lsces.co.uk> wrote:
> '3' is not a very processor friendly number, so working with 4 even though
> wasteful on memory, does make perfect sense. How long is it since we had a
> 640k limit on working memory? SERVERS should have a good amount of memory
> for caching information anyway. SO is UTF-16 the right approach for
> processing wide strings? It needs special code to handle everything wider
> than 16 bits, but at what gain really? If all core functionality is handled
> as 32 bit characters is there that much of an overhead over the additional
> processing to get around strings of dissimilar sizes in UTF-16 ?

Just to reinforce some of Lester's points above.

4 bytes per character is never slower than 2 bytes per character... it's
faster if anything. Bear in mind that 4 bytes has been the de facto size
of general-purpose CPU registers on 32-bit microarchitectures since...
like... forever. Give a C compiler 4 bytes of data... it'll say: thank
you very much, and more of the same please! It keeps 'em happy ;)
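
To make Lester's point about the "special code" concrete, here is a rough
sketch in C of fetching the n-th character from a UTF-16 buffer versus a
32-bit one. The helper names (utf16_next, utf16_char_at, utf32_char_at) are
made up for illustration and are not anything from the PHP source. Passing
unpaired surrogates through untouched is also just an assumption of this
sketch.

```c
#include <stddef.h>
#include <stdint.h>

#define NO_CHAR 0xFFFFFFFFu  /* sentinel for "index out of range" (assumption) */

/* Hypothetical helper: decode the code point starting at s[*i] in a UTF-16
 * buffer of len code units and advance *i past it. An unpaired surrogate is
 * returned unchanged; real code might reject or replace it instead. */
static uint32_t utf16_next(const uint16_t *s, size_t len, size_t *i)
{
    uint32_t c = s[(*i)++];
    if (c >= 0xD800 && c <= 0xDBFF && *i < len) {      /* high surrogate */
        uint32_t lo = s[*i];
        if (lo >= 0xDC00 && lo <= 0xDFFF) {            /* low surrogate */
            (*i)++;
            c = 0x10000 + ((c - 0xD800) << 10) + (lo - 0xDC00);
        }
    }
    return c;
}

/* UTF-16: the n-th character needs a walk from the start, because any
 * character before it may occupy one or two code units. */
static uint32_t utf16_char_at(const uint16_t *s, size_t len, size_t n)
{
    size_t i = 0;
    while (i < len) {
        uint32_t c = utf16_next(s, len, &i);
        if (n-- == 0)
            return c;
    }
    return NO_CHAR;
}

/* 32-bit characters: the n-th character is a plain array index. */
static uint32_t utf32_char_at(const uint32_t *s, size_t len, size_t n)
{
    return n < len ? s[n] : NO_CHAR;
}
```

With the input "a", U+1D11E, "b", the UTF-16 buffer is four code units long
and the 32-bit buffer is three. Asking either function for character 1 gives
0x1D11E, but only the UTF-16 version has to loop and test for surrogates to
get there.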

Sure, UTF-16 can make sense, but only if your external representations
are also in UTF-16. So what are the default Unicode settings for MySQL,
PostgreSQL, etc.? Are they usually set to UTF-8, or to UTF-16?

Just do the same as them.

-- 
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php
