From:             ywliu at hotmail dot com
Operating system: linux
PHP version:      4.3.4
PHP Bug Type:     *Languages/Translation
Bug description:  htmlentities fail to escape BIG5 characters correctly

Description:
------------
In ext/standard/html.c, htmlentities() fails to identify BIG5 Chinese
characters correctly. I have checked CVS version 1.87, and the bug is
still there.

Reproduce code:
---------------
In html.c, look for this piece of code:

    case cs_big5:
    case cs_gb2312:
    case cs_big5hkscs:
        {
            /* check if this is the first of a 2-byte sequence */
            if (this_char >= 0xa1 && this_char <= 0xf9) {
                /* peek at the next char */
                unsigned char next_char = str[pos];
                if ((next_char >= 0x40 && next_char <= 0x73) ||
                        (next_char >= 0xa1 && next_char <= 0xfe)) {

Expected result:
----------------
In fact, the first byte should be in the range 0xa1-0xfe, and the second
byte should be in the range 0x40-0x7e or 0xa1-0xfe (from page 88 of
"Understanding Japanese Information Processing" by Ken Lunde, O'Reilly).

Actual result:
--------------
So it should be:

    if (this_char >= 0xa1 && this_char <= 0xfe) {

and

    if ((next_char >= 0x40 && next_char <= 0x7e) ||
            (next_char >= 0xa1 && next_char <= 0xfe)) {

-- 
Edit bug report at http://bugs.php.net/?id=27505&edit=1
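
To make the difference between the two sets of ranges concrete, here is a minimal sketch that isolates the byte checks as standalone predicates. The function names are hypothetical and do not appear in html.c. The "current" ranges are the ones quoted in this report and the "proposed" ranges are the ones suggested above; neither has been checked here against the BIG5, GB2312 or BIG5-HKSCS specifications.

```c
#include <stdbool.h>

/* Ranges as quoted from html.c in this report (helper names are hypothetical). */
bool is_lead_current(unsigned char c)
{
    return c >= 0xa1 && c <= 0xf9;
}

bool is_trail_current(unsigned char c)
{
    return (c >= 0x40 && c <= 0x73) || (c >= 0xa1 && c <= 0xfe);
}

/* Ranges proposed in this report. */
bool is_lead_proposed(unsigned char c)
{
    return c >= 0xa1 && c <= 0xfe;
}

bool is_trail_proposed(unsigned char c)
{
    return (c >= 0x40 && c <= 0x7e) || (c >= 0xa1 && c <= 0xfe);
}

/* A byte pair is treated as one 2-byte character only if both checks pass. */
bool is_pair_current(unsigned char lead, unsigned char trail)
{
    return is_lead_current(lead) && is_trail_current(trail);
}

bool is_pair_proposed(unsigned char lead, unsigned char trail)
{
    return is_lead_proposed(lead) && is_trail_proposed(trail);
}
```

With these predicates, a pair such as 0xa4 0x74 (second byte between 0x74 and 0x7e) or 0xfa 0x40 (first byte between 0xfa and 0xfe) is rejected by the current checks but accepted by the proposed ones, while a pair such as 0xa4 0x40 is accepted by both.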