From:             colourmusic at gmail dot com
Operating system: Win XP SP2
PHP version:      6CVS-2008-04-30 (snap)
PHP Bug Type:     *Unicode Issues
Bug description:  Replaces UTF-8 symbol with incorrect symbol

Description:
------------
While parsing a UTF-8 encoded URL I noticed that the character ８ (U+FF18, fullwidth digit eight, bytes EF BC 98) is replaced with the bytes EF BC 5F, which is not a valid UTF-8 sequence. The script generated no errors or warnings.

The fullwidth digits ０ (U+FF10) to ７ (U+FF17) and ９ (U+FF19) pass through parse_url() without any problems.

This bug also appears in PHP 5.2.3 and PHP 5.2.5.

Reproduce code:
---------------
<?php
// mb_convert_encoding() gives the same result as html_entity_decode() in this example
//$url = mb_convert_encoding("https://example.com/?SHAMEI=ランドクルーザー８０バン&SHAMEI_CD=01465,", "utf-8", "html-entities");
$url = html_entity_decode("https://example.com/?SHAMEI=ランドクルーザー８０バン&SHAMEI_CD=01465,", ENT_NOQUOTES, "utf-8");
echo "Original URL = $url <br />\n";
$result = parse_url($url);
// print_r() prints the array itself; echoing its return value would append a stray "1"
print_r($result);
?>

Expected result:
----------------
Original URL = https://example.com/?SHAMEI=ランドクルーザー８０バン&SHAMEI_CD=01465,
Array
(
    [scheme] => https
    [host] => example.com
    [path] => /
    [query] => SHAMEI=ランドクルーザー８０バン&SHAMEI_CD=01465,
)

Actual result:
--------------
Original URL = https://example.com/?SHAMEI=ランドクルーザー８０バン&SHAMEI_CD=01465,
Array
(
    [scheme] => https
    [host] => example.com
    [path] => /
    [query] => ランドクルーザー�_０バン&SHAMEI_CD=01465,
)
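
Because the damaged character shows up only as a replacement glyph in the output, it may be easier to confirm the substitution at byte level. The following is a minimal sketch, assuming PHP 7.1 or later; first_differing_byte() is a hypothetical helper written for this illustration and was not part of the original report. It compares the query string that goes into parse_url() with the one that comes out. On an affected build it should report a difference at the third byte of the fullwidth digit (98 becoming 5f); on an unaffected build the bytes should match.

```php
<?php
// Hypothetical helper (not from the original report): returns the offset of
// the first byte at which two strings differ, or null if they are identical.
function first_differing_byte(string $a, string $b): ?int
{
    $limit = min(strlen($a), strlen($b));
    for ($i = 0; $i < $limit; $i++) {
        if ($a[$i] !== $b[$i]) {
            return $i;
        }
    }
    // No mismatch in the common prefix: the strings differ only if their
    // lengths differ, in which case the difference starts where the shorter ends.
    return strlen($a) === strlen($b) ? null : $limit;
}

// "\xEF\xBC\x98" is U+FF18 (fullwidth digit eight) encoded as UTF-8.
$query = "SHAMEI=\xEF\xBC\x98&SHAMEI_CD=01465,";
$url   = "https://example.com/?" . $query;

// With a component constant, parse_url() returns just that component.
$parsed = (string) parse_url($url, PHP_URL_QUERY);
$offset = first_differing_byte($query, $parsed);

if ($offset === null) {
    echo "query bytes preserved: ", bin2hex($query), "\n";
} else {
    echo "first difference at byte offset $offset\n";
    echo "expected: ", bin2hex($query), "\n";
    echo "actual:   ", bin2hex($parsed), "\n";
}
```

Writing the character as an escaped byte sequence keeps the test independent of the source file's encoding and of html_entity_decode().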