At 2004-05-11 04:22, Asgeir Frimannsson wrote:
>        if ( $unicode < 128 ) {
>        
>            $utf8 = chr( $unicode );
>        
>        } elseif ( $unicode < 2048 ) {
>        
>            $utf8 .= chr( 192 +  ( ( $unicode - ( $unicode % 64 ) ) / 64 ) );

or as an alternative:

chr(0xc0|$unicode>>6)

>            $utf8 .= chr( 128 + ( $unicode % 64 ) );

chr(0x80|$unicode&0x3f)

>        
>        } else {
>        
>            $utf8 .= chr( 224 + ( ( $unicode - ( $unicode % 4096 ) ) / 4096 ) );

chr(0xe0|$unicode>>12)

>            $utf8 .= chr( 128 + ( ( ( $unicode % 4096 ) - ( $unicode % 64 ) ) / 64 ) 
> );

chr(0x80|$unicode>>6&0x3f)

>            $utf8 .= chr( 128 + ( $unicode % 64 ) );

chr(0x80|$unicode&0x3f)

This way it's all done with bitwise operators on integers
(and not in floating point). Since this subroutine may
have to be called for up to every character in a document,
this may be quite a bit faster.
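
To make the pieces above easier to try out, here they are assembled
into one function. This is only a sketch: the function name is made up,
I added explicit parentheses (they do not change the result, since in
PHP >> binds tighter than &, which binds tighter than |), and like the
quoted code it assumes $unicode is below 0x10000, so at most three bytes.

```php
<?php
// Hypothetical helper name; assumes 0 <= $unicode < 0x10000,
// the same range the quoted code covers.
function codepoint_to_utf8($unicode)
{
    if ($unicode < 0x80) {
        // One byte: 0xxxxxxx
        return chr($unicode);
    }
    if ($unicode < 0x800) {
        // Two bytes: 110xxxxx 10xxxxxx
        return chr(0xc0 | ($unicode >> 6))
             . chr(0x80 | ($unicode & 0x3f));
    }
    // Three bytes: 1110xxxx 10xxxxxx 10xxxxxx
    return chr(0xe0 | ($unicode >> 12))
         . chr(0x80 | (($unicode >> 6) & 0x3f))
         . chr(0x80 | ($unicode & 0x3f));
}

echo bin2hex(codepoint_to_utf8(0xe9)), "\n";   // c3a9   (e with acute accent)
echo bin2hex(codepoint_to_utf8(0x20ac)), "\n"; // e282ac (euro sign)
```

Returning the string instead of appending to $utf8 is just my choice
for the sketch; appending with .= as in the original works equally well.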

The code is also much shorter, so it is easier to check and debug.
(Of course you can exchange the hex numbers for decimal ones,
but since these values are bit patterns, the UTF-8 lead-byte
prefixes and a six-bit mask, I prefer to write them in hex.)
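
If it helps to see why the hex form reads more naturally, this small
sketch prints each constant used above in hex, decimal and binary.
Nothing here is part of the conversion itself; it is only an
illustration.

```php
<?php
// Show each constant from the reply as hex, decimal and 8-bit binary.
foreach (array(0x80, 0xc0, 0xe0, 0x3f) as $n) {
    printf("0x%02x = %3d = %08b\n", $n, $n, $n);
}
// 0x80 = 128 = 10000000  (continuation-byte prefix)
// 0xc0 = 192 = 11000000  (two-byte lead prefix)
// 0xe0 = 224 = 11100000  (three-byte lead prefix)
// 0x3f =  63 = 00111111  (mask for the low six bits)
```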

Greetings,
Jaap

-- 
PHP Internationalization Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php
