Michael Wallner wrote:
Hi,

I'm the author of PEAR's I18Nv2 [1] module, whose maintenance I kind of slid into because I wanted to introduce a Win32/Linux-independent setLocale().

Currently I want to utilize IBM's ICU resources [2], which encode Unicode characters in the "\u00F4" way.

I already wrote a parser [3] for the ICU files, but searching the web I didn't find a neat way to convert these "\u00F4" escapes into the actual characters.

Can anybody help out?

Thanks a lot,
mike

[1] http://pear.php.net/package/I18Nv2
[2] http://oss.software.ibm.com/cvs/icu/icu/source/data/locales/
[3] http://cvs.php.net/co.php/pear/I18Nv2/OpenI18N/ICUParser.php


Hi Mike,

I don't know if this is exactly what you're after, but the following example converts a hexadecimal Unicode code point, e.g. "00F4" (strip the "\u" first), to UTF-8:

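<?php
// Convert a hexadecimal Unicode code point (e.g. "00F4") to its UTF-8 bytes.
function unicode_to_utf8( $unicode_hex ) {

    $unicode = hexdec( $unicode_hex );
    $utf8 = '';

    if ( $unicode < 0x80 ) {
        // 1 byte: 0xxxxxxx (plain ASCII)
        $utf8 .= chr( $unicode );
    } elseif ( $unicode < 0x800 ) {
        // 2 bytes: 110xxxxx 10xxxxxx
        $utf8 .= chr( 0xC0 | ( $unicode >> 6 ) );
        $utf8 .= chr( 0x80 | ( $unicode & 0x3F ) );
    } elseif ( $unicode < 0x10000 ) {
        // 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        $utf8 .= chr( 0xE0 | ( $unicode >> 12 ) );
        $utf8 .= chr( 0x80 | ( ( $unicode >> 6 ) & 0x3F ) );
        $utf8 .= chr( 0x80 | ( $unicode & 0x3F ) );
    } else {
        // 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx (beyond U+FFFF)
        $utf8 .= chr( 0xF0 | ( $unicode >> 18 ) );
        $utf8 .= chr( 0x80 | ( ( $unicode >> 12 ) & 0x3F ) );
        $utf8 .= chr( 0x80 | ( ( $unicode >> 6 ) & 0x3F ) );
        $utf8 .= chr( 0x80 | ( $unicode & 0x3F ) );
    } // if

    return $utf8;

} // unicode_to_utf8

header('Content-Type: text/plain; charset=utf-8');

$ch4 = '0034'; // digit '4'
$chA = '0041'; // char 'A'
$utf4 = unicode_to_utf8($ch4);
$utfA = unicode_to_utf8($chA);

print $utf4 . "\n" . $utfA;
?>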
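
The function above handles one code point at a time, so the "\u" still has to be stripped by hand. For whole strings out of the ICU files, something along these lines might work. It is only a rough sketch: codepoint_to_utf8() and unescape_unicode() are names made up for this example, it assumes every escape is exactly "\u" plus four hex digits, it does not combine surrogate pairs, and the closure needs PHP 5.3 or later.

```php
<?php
// Hypothetical helper: encode one code point (an integer) as UTF-8.
// Covers U+0000..U+FFFF only, which is all a four-digit \uXXXX escape can express.
function codepoint_to_utf8($cp) {
    if ($cp < 0x80) {
        return chr($cp);
    }
    if ($cp < 0x800) {
        return chr(0xC0 | ($cp >> 6)) . chr(0x80 | ($cp & 0x3F));
    }
    return chr(0xE0 | ($cp >> 12))
         . chr(0x80 | (($cp >> 6) & 0x3F))
         . chr(0x80 | ($cp & 0x3F));
}

// Hypothetical helper: replace every \uXXXX escape in $text with its UTF-8 bytes.
function unescape_unicode($text) {
    return preg_replace_callback(
        '/\\\\u([0-9A-Fa-f]{4})/',   // a literal backslash, "u", four hex digits
        function ($m) {
            return codepoint_to_utf8(hexdec($m[1]));
        },
        $text
    );
}

// Print the result as hex so the bytes are visible regardless of terminal encoding.
echo bin2hex(unescape_unicode('c\u00F4te')), "\n"; // 63c3b47465
```

In the output, "c3b4" is the two-byte UTF-8 form of U+00F4; the surrounding ASCII letters pass through unchanged.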

See
http://www.unicode.org/Public/UNIDATA/UnicodeData.txt
- the Unicode mapping table

http://www.randomchaos.com/document.php?source=php_and_unicode
- the article that inspired the unicode_to_utf8 function
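
If the JSON extension happens to be available, another option might be to let its parser do the work, since JSON strings use the same "\uXXXX" notation. This is just a sketch under that assumption: json_unescape() is a made-up name, and the input must contain nothing but "\uXXXX" sequences (a stray quote or backslash would make the decode fail).

```php
<?php
// Hypothetical helper: decode a run of \uXXXX escapes via the JSON parser.
// Assumes $escapes holds only \uXXXX sequences (no quotes or other backslashes).
function json_unescape($escapes) {
    return json_decode('"' . $escapes . '"');
}

// Print as hex so the bytes are visible regardless of terminal encoding.
echo bin2hex(json_unescape('\u00F4')), "\n";        // c3b4
echo bin2hex(json_unescape('\uD83D\uDE00')), "\n";  // f09f9880
```

Unlike the hand-written arithmetic, json_decode() also joins a surrogate pair into a single four-byte character, as the second line shows.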

regards,
asgeir

--
PHP Internationalization Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php


