On Tue, Mar 16, 2010 at 8:30 AM, Lester Caine <les...@lsces.co.uk> wrote:
> '3' is not a very processor friendly number, so working with 4 even though
> wasteful on memory, does make perfect sense. How long is it since we had a
> 640k limit on working memory? SERVERS should have a good amount of memory
> for caching information anyway. SO is UTF-16 the right approach for
> processing wide strings? It needs special code to handle everything wider
> than 16 bits, but at what gain really? If all core functionality is handled
> as 32 bit characters is there that much of an overhead over the additional
> processing to get around strings of dissimilar sizes in UTF-16 ?
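
To make the "special code" point concrete, here is a minimal sketch in Python (picked only for brevity). The sample string and the helper names encoded_sizes, utf16_units and decode_utf16_units are illustrative assumptions of mine, not anything from the PHP source. It only shows the shape of the problem: a UTF-32 string holds one code point per unit, while a UTF-16 string has to detect and combine surrogate pairs for anything above U+FFFF.

```python
# Sketch only: compares storage size and decoding effort across encodings.
# The helper names and the sample string are hypothetical, for illustration.

def encoded_sizes(text):
    """Return the byte length of text in each encoding (no BOM)."""
    return {
        "utf-8": len(text.encode("utf-8")),
        "utf-16": len(text.encode("utf-16-le")),
        "utf-32": len(text.encode("utf-32-le")),
    }


def utf16_units(text):
    """Split text into the 16-bit code units a UTF-16 string would store."""
    raw = text.encode("utf-16-le")
    return [int.from_bytes(raw[i:i + 2], "little") for i in range(0, len(raw), 2)]


def decode_utf16_units(units):
    """Walk UTF-16 code units, combining surrogate pairs into code points.

    This loop is the extra work that a fixed-width 32-bit representation
    avoids: there, units[i] is already the i-th code point.
    """
    code_points = []
    i = 0
    while i < len(units):
        unit = units[i]
        is_high = 0xD800 <= unit <= 0xDBFF
        has_low = i + 1 < len(units) and 0xDC00 <= units[i + 1] <= 0xDFFF
        if is_high and has_low:
            low = units[i + 1]
            code_points.append(0x10000 + ((unit - 0xD800) << 10) + (low - 0xDC00))
            i += 2
        else:
            code_points.append(unit)
            i += 1
    return code_points


# 'a', EURO SIGN, and MUSICAL SYMBOL G CLEF (U+1D11E, outside the 16-bit range)
sample = "a\u20ac\U0001D11E"

sizes = encoded_sizes(sample)
units = utf16_units(sample)
points = decode_utf16_units(units)

print(sizes)                      # bytes needed per encoding
print([hex(u) for u in units])    # 4 code units for 3 characters
print([hex(p) for p in points])   # 3 code points after pairing surrogates
```

For this three-character sample, UTF-8 and UTF-16 both take 8 bytes and UTF-32 takes 12, so the fixed-width form does cost memory. What it buys is that the n-th character sits at a fixed offset, with no pairing loop like the one above.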
Just to reinforce some of Lester's points above: 4 bytes per character is never slower than 2 bytes per character... it's faster if anything. Bear in mind that 4 bytes has been the de facto register size on all modern CPUs / 32-bit microarchitectures since... like... forever. Give a C compiler 4 bytes of data... it'll say: thank you very much, and more of the same please! It keeps 'em happy ;)

Sure, UTF-16 can make sense. But only if your external representations are also in UTF-16. So what are the default Unicode settings for MySQL, PostgreSQL, etc.? Are they always set to UTF-8, or to UTF-16? Just do the same as them.

-- 
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php