ID:               27505
 Updated by:       [EMAIL PROTECTED]
 Reported By:      ywliu at hotmail dot com
-Status:           Open
+Status:           Closed
 Bug Type:         *Languages/Translation
 Operating System: linux
 PHP Version:      4.3.4
 New Comment:

This bug has been fixed in CVS.

Snapshots of the sources are packaged every three hours; this change
will be in the next snapshot. You can grab the snapshot at
http://snaps.php.net/.
 
Thank you for the report, and for helping us make PHP better.




Previous Comments:
------------------------------------------------------------------------

[2004-03-05 03:43:06] ywliu at hotmail dot com

Description:
------------
In ext/standard/html.c, htmlentities() fails to identify BIG5 Chinese
characters correctly.

I have checked CVS version 1.87; the bug is still there.

Reproduce code:
---------------
In html.c, look for this piece of code:

case cs_big5:
case cs_gb2312:
case cs_big5hkscs:
    {
        /* check if this is the first of a 2-byte sequence */
        if (this_char >= 0xa1 && this_char <= 0xf9) {
            /* peek at the next char */
            unsigned char next_char = str[pos];
            if ((next_char >= 0x40 && next_char <= 0x73) ||
                (next_char >= 0xa1 && next_char <= 0xfe)) {

Expected result:
----------------
In fact, for BIG5 the first byte should be in the range 0xa1-0xfe, and
the second byte should be in either 0x40-0x7e or 0xa1-0xfe.

(From page 88 of "Understanding Japanese Information Processing" by Ken
Lunde, O'Reilly.)
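
As a rough illustration of the ranges above, here is a minimal standalone
sketch in C. The helper names big5_is_lead, big5_is_trail and
big5_count_chars are hypothetical and do not appear in html.c; the sketch
only shows how the two ranges could be used to step through a byte string.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not from html.c): first byte of a 2-byte sequence. */
static int big5_is_lead(unsigned char c)
{
    return c >= 0xa1 && c <= 0xfe;
}

/* Hypothetical helper (not from html.c): second byte of a 2-byte sequence. */
static int big5_is_trail(unsigned char c)
{
    return (c >= 0x40 && c <= 0x7e) || (c >= 0xa1 && c <= 0xfe);
}

/*
 * Count characters in a byte string of length len.
 * A lead byte followed by a valid trail byte counts as one character;
 * any other byte counts as one character on its own.
 */
static size_t big5_count_chars(const unsigned char *s, size_t len)
{
    size_t pos = 0, count = 0;

    assert(s != NULL);
    while (pos < len) {
        if (big5_is_lead(s[pos]) && pos + 1 < len &&
            big5_is_trail(s[pos + 1])) {
            pos += 2;   /* consume both bytes of the pair */
        } else {
            pos += 1;   /* single byte, or a lead byte without a valid trail */
        }
        count++;
    }
    return count;
}
```

Note the bounds check before peeking at the next byte; without it a lead
byte at the very end of the string would read past the buffer.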

Actual result:
--------------
The current code stops the first-byte range at 0xf9 and the lower
second-byte range at 0x73. So it should be:

        if (this_char >= 0xa1 && this_char <= 0xfe) {

and

                if ((next_char >= 0x40 && next_char <= 0x7e) ||
                    (next_char >= 0xa1 && next_char <= 0xfe)) {
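
To get a feel for how much the narrower ranges miss, the following sketch
compares the old and the proposed checks over every possible byte pair. The
function names are made up for illustration and the code is independent of
html.c. It counts raw byte pairs, not assigned BIG5 characters.

```c
/* Range check as quoted from html.c in this report. */
static int old_is_pair(unsigned char lead, unsigned char trail)
{
    return (lead >= 0xa1 && lead <= 0xf9) &&
           ((trail >= 0x40 && trail <= 0x73) ||
            (trail >= 0xa1 && trail <= 0xfe));
}

/* Range check with the ranges proposed in this report. */
static int new_is_pair(unsigned char lead, unsigned char trail)
{
    return (lead >= 0xa1 && lead <= 0xfe) &&
           ((trail >= 0x40 && trail <= 0x7e) ||
            (trail >= 0xa1 && trail <= 0xfe));
}

/* Count byte pairs accepted by the proposed check but rejected by the old one. */
static unsigned int count_missed_pairs(void)
{
    unsigned int lead, trail, missed = 0;

    for (lead = 0; lead <= 0xff; lead++) {
        for (trail = 0; trail <= 0xff; trail++) {
            if (new_is_pair((unsigned char)lead, (unsigned char)trail) &&
                !old_is_pair((unsigned char)lead, (unsigned char)trail)) {
                missed++;
            }
        }
    }
    return missed;
}
```

With these ranges, count_missed_pairs() returns 1764: the proposed check
accepts 94 * 157 = 14758 pairs, while the old one accepts 89 * 146 = 12994.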

                        


------------------------------------------------------------------------


-- 
Edit this bug report at http://bugs.php.net/?id=27505&edit=1
