Hi,
I'm the author of PEAR's I18Nv2 [1] module, whose maintenance I more or less slid into because I wanted to introduce a Win32/Linux-independent setLocale().
Currently I want to utilize IBM's ICU resources [2], which encode Unicode characters as escapes of the form "\u00F4".
I already wrote a parser [3] for the ICU files, but searching the web I didn't find a neat way to convert these "\u00F4" escapes into actual characters.
Can anybody help out?
Thanks a lot, mike
[1] http://pear.php.net/package/I18Nv2
[2] http://oss.software.ibm.com/cvs/icu/icu/source/data/locales/
[3] http://cvs.php.net/co.php/pear/I18Nv2/OpenI18N/ICUParser.php
Hi Mike,
Don't know if this is exactly what you're after, but the following example converts a hex Unicode code point, e.g. "00F4" (strip the "\u" first), to UTF-8:
<?php
// Convert a hex code point such as "00F4" to its UTF-8 byte sequence.
// Covers code points up to FFFF, i.e. anything a 4-digit "\u" escape can hold.
function unicode_to_utf8( $unicode_hex ) {
    $unicode = hexdec($unicode_hex);
    $utf8 = '';
    if ( $unicode < 128 ) {
        // 1 byte (plain ASCII)
        $utf8 = chr( $unicode );
    } elseif ( $unicode < 2048 ) {
        // 2 bytes: lead byte 192 + upper bits, then 128 + lower 6 bits
        $utf8 .= chr( 192 + ( ( $unicode - ( $unicode % 64 ) ) / 64 ) );
        $utf8 .= chr( 128 + ( $unicode % 64 ) );
    } else {
        // 3 bytes: lead byte 224 + upper bits, then two continuation bytes
        $utf8 .= chr( 224 + ( ( $unicode - ( $unicode % 4096 ) ) / 4096 ) );
        $utf8 .= chr( 128 + ( ( ( $unicode % 4096 ) - ( $unicode % 64 ) ) / 64 ) );
        $utf8 .= chr( 128 + ( $unicode % 64 ) );
    } // if
    return $utf8;
} // unicode_to_utf8

header('Content-Type: text/plain; charset=utf-8');

$ch4 = '0034'; // digit '4'
$chA = '0041'; // char 'A'
$utf4 = unicode_to_utf8($ch4);
$utfA = unicode_to_utf8($chA);

print $utf4 ."\n" . $utfA;
?>
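If you would rather convert a whole string from an ICU file in one go, instead of stripping each "\u" by hand, something along these lines might do. It is only a sketch: decode_u_escapes() is a made-up name (not part of I18Nv2 or PHP), it assumes PHP 5.3+ for the closure, and it only handles 4-digit "\uXXXX" escapes, treating each one on its own (so no surrogate pairs).

```php
<?php
// Hypothetical helper: replace every "\uXXXX" escape in $text with the
// UTF-8 bytes of that code point. Same arithmetic as a per-character
// converter, written with bit operations.
function decode_u_escapes( $text ) {
    return preg_replace_callback(
        '/\\\\u([0-9A-Fa-f]{4})/',   // literal backslash, "u", 4 hex digits
        function ( $m ) {
            $cp = hexdec( $m[1] );
            if ( $cp < 0x80 ) {
                // 1 byte (ASCII)
                return chr( $cp );
            }
            if ( $cp < 0x800 ) {
                // 2 bytes
                return chr( 0xC0 | ( $cp >> 6 ) )
                     . chr( 0x80 | ( $cp & 0x3F ) );
            }
            // 3 bytes
            return chr( 0xE0 | ( $cp >> 12 ) )
                 . chr( 0x80 | ( ( $cp >> 6 ) & 0x3F ) )
                 . chr( 0x80 | ( $cp & 0x3F ) );
        },
        $text
    );
}

// Single quotes keep the backslashes literal, as they would be in the file.
$decoded = decode_u_escapes( 'C\u00F4te d\u0027Ivoire' );
print $decoded . "\n";
```

Text without any escapes passes through unchanged, so it should be safe to run over every value the parser returns.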
See http://www.unicode.org/Public/UNIDATA/UnicodeData.txt - Unicode mapping table
http://www.randomchaos.com/document.php?source=php_and_unicode - the page that inspired the unicode_to_utf8 function
regards, asgeir
--
PHP Internationalization Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php