Re: [PHP-I18N] Re: Converting "\u00F4" style characters

Jaap van Ganswijk Mon, 16 Aug 2004 05:49:38 -0700

At 2004-05-11 04:22, Asgeir Frimannsson wrote:
>        if ( $unicode < 128 ) {
>        
>            $utf8 = chr( $unicode );
>        
>        } elseif ( $unicode < 2048 ) {
>        
>            $utf8 .= chr( 192 +  ( ( $unicode - ( $unicode % 64 ) ) / 64 ) );


or as an alternative:

chr(0xc0|$unicode>>6)

>            $utf8 .= chr( 128 + ( $unicode % 64 ) );

chr(0x80|$unicode&0x3f)

>        
>        } else {
>        
>            $utf8 .= chr( 224 + ( ( $unicode - ( $unicode % 4096 ) ) / 4096 ) );

chr(0xe0|$unicode>>12)

>            $utf8 .= chr( 128 + ( ( ( $unicode % 4096 ) - ( $unicode % 64 ) ) / 64 ) 
> );

chr(0x80|$unicode>>6&0x3f)

>            $utf8 .= chr( 128 + ( $unicode % 64 ) );

chr(0x80|$unicode&0x3f)

This way it's all done with bitwise operators on integers
(and not in floating point). Since this subroutine may
have to be called for up to every character in a document,
this may be quite a bit faster.
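
A quick way to convince yourself that the two forms agree is to run both
over the whole three-byte range and count the differences. This is only a
sketch: the function names bytes_arith() and bytes_bitwise() are made up
for the test, and the type declarations assume PHP 7 or later.

```php
<?php
// Hypothetical helpers, not from the original posts: each returns the
// three byte values for a code point in the range 0x800..0xffff.
function bytes_arith(int $unicode): array
{
    // Modulo/division form, as in the quoted code.
    return [
        224 + (($unicode - ($unicode % 4096)) / 4096),
        128 + ((($unicode % 4096) - ($unicode % 64)) / 64),
        128 + ($unicode % 64),
    ];
}

function bytes_bitwise(int $unicode): array
{
    // Shift/mask form; relies on >> binding tighter than &,
    // and & binding tighter than |.
    return [
        0xe0 | $unicode >> 12,
        0x80 | $unicode >> 6 & 0x3f,
        0x80 | $unicode & 0x3f,
    ];
}

$mismatches = 0;
for ($cp = 0x800; $cp <= 0xffff; $cp++) {
    // Loose comparison, so an int/float difference alone does not count.
    if (bytes_arith($cp) != bytes_bitwise($cp)) {
        $mismatches++;
    }
}
echo $mismatches, "\n";
```

The loop covers only the three-byte branch; the two-byte branch can be
checked the same way over 0x80..0x7ff.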

The code is also much shorter, so it is easier to check and debug.
(Of course you can exchange the hex numbers for decimal ones,
but since these are special numbers within the
hexadecimal/binary system I prefer to write them in hex.)
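
Put together, the replacement lines could look roughly like the sketch
below. The function name unicode_to_utf8() is an assumption (the quoted
code does not show the enclosing function), the parentheses are added only
for readability, and like the quoted branches it handles code points up to
U+FFFF only. The comments show the bit pattern each hex constant stands
for.

```php
<?php
// Sketch only: unicode_to_utf8() is a hypothetical name. Covers
// U+0000..U+FFFF; no four-byte sequences and no surrogate checks.
function unicode_to_utf8(int $unicode): string
{
    if ($unicode < 0x80) {
        // 0xxxxxxx: plain ASCII, one byte
        return chr($unicode);
    }
    if ($unicode < 0x800) {
        // 110xxxxx 10xxxxxx
        return chr(0xc0 | ($unicode >> 6))
             . chr(0x80 | ($unicode & 0x3f));
    }
    // 1110xxxx 10xxxxxx 10xxxxxx
    return chr(0xe0 | ($unicode >> 12))
         . chr(0x80 | (($unicode >> 6) & 0x3f))
         . chr(0x80 | ($unicode & 0x3f));
}

echo bin2hex(unicode_to_utf8(0xf4)), "\n";   // c3b4 (the "\u00F4" from the subject)
echo bin2hex(unicode_to_utf8(0x20ac)), "\n"; // e282ac
```

Code points above U+FFFF would need a fourth branch (0xf0 lead byte),
which neither the quoted code nor this sketch provides.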

Greetings,
Jaap

-- 
PHP Internationalization Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php
