From:             jc at mega-bucks dot co dot jp
Operating system: Linux
PHP version:      4.3.1
PHP Bug Type:     mbstring related
Bug description:  mb_detect_encoding returns EUC-JP for an invalid EUC-JP char sequence

Description:
------------
I've just run into a strange "bug". I have a form on my web site that
takes input from the user and then uses that to do a search of a
postgresql database.

The form is set to be EUC-JP, but this weekend a user submitted a query
that PostgreSQL rejected because it "contains invalid EUC-JP" characters.
Luckily the error was logged and I was able to track it down.

I thought that maybe the user had entered some bad characters in the
form or used some other encoding, so I decided to check that the
submitted form data really is EUC-JP using mb_detect_encoding().
Unfortunately, mb_detect_encoding() says that the invalid string *is*
in EUC-JP!?

The query string, as it appears in the URL, is:

search_words=%B7%F6%BA%7E

In the script that parses this query I have put the following:

// Read the raw (already URL-decoded) search parameter from the query string
$words = $_GET["search_words"];
$enc = mb_detect_encoding($words);
echo "encoding is $enc and the query is ($words)";
die;

The result is:

encoding is EUC-JP and the query is (喧?)

As you can see, the query string is *not* a valid EUC-JP sequence ...
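
To make the problem concrete, here is a minimal sketch of a byte-level
check. It assumes the usual EUC-JP layout: ASCII below 0x80, two-byte
characters with both bytes in 0xA1-0xFE, 0x8E followed by one byte in
0xA1-0xDF, and 0x8F followed by two bytes in 0xA1-0xFE. The function
name looks_like_euc_jp() is hypothetical and not part of mbstring; it
checks byte structure only, not whether each code point is assigned.

```php
<?php
// Hypothetical helper: structural EUC-JP check, byte ranges as assumed above.
function looks_like_euc_jp($s)
{
    $n = strlen($s);
    $i = 0;
    while ($i < $n) {
        $b = ord($s[$i]);
        if ($b < 0x80) {                        // ASCII: single byte
            $i++;
            continue;
        }
        if ($b == 0x8E) {                       // half-width katakana
            $len = 2; $lo = 0xA1; $hi = 0xDF;
        } elseif ($b == 0x8F) {                 // JIS X 0212, three bytes
            $len = 3; $lo = 0xA1; $hi = 0xFE;
        } elseif ($b >= 0xA1 && $b <= 0xFE) {   // JIS X 0208, two bytes
            $len = 2; $lo = 0xA1; $hi = 0xFE;
        } else {
            return false;                       // not a valid lead byte
        }
        if ($i + $len > $n) {
            return false;                       // truncated character
        }
        for ($k = 1; $k < $len; $k++) {
            $t = ord($s[$i + $k]);
            if ($t < $lo || $t > $hi) {
                return false;                   // trail byte out of range
            }
        }
        $i += $len;
    }
    return true;
}

$words = urldecode('%B7%F6%BA%7E');             // bytes B7 F6 BA 7E
var_dump(looks_like_euc_jp($words));            // bool(false)
```

The first pair, B7 F6, passes. The second pair fails because 0xBA is a
lead byte and 0x7E is below 0xA1, so it cannot be a trail byte.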

Reproduce code:
---------------
// Request the script with ?search_words=%B7%F6%BA%7E
$words = $_GET["search_words"];
$enc = mb_detect_encoding($words);
echo "encoding is $enc and the query is ($words)";
die;

Expected result:
----------------
SJIS (?) or Undefined.

The documentation for mb_detect_encoding() does not specify what it
returns when it is passed an invalid character sequence whose encoding
cannot be detected.

In the above case, the character sequence is valid SJIS, I believe ...
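
If the goal is to validate input rather than guess its encoding, a
validation call may be a better fit than detection. The sketch below
assumes a PHP release newer than the 4.3.1 reported here, one where the
mbstring extension provides mb_check_encoding(). The variable names and
the rejection message are illustrative only.

```php
<?php
// Assumption: mbstring is loaded and mb_check_encoding() is available,
// which is not the case in PHP 4.3.1.
$words = urldecode('%B7%F6%BA%7E');   // same bytes as in the report

// Validate against the one encoding the form is declared to use,
// instead of asking mbstring to guess among several candidates.
$is_valid = mb_check_encoding($words, 'EUC-JP');

if (!$is_valid) {
    // Reject the input before it reaches the database.
    echo "search string is not valid EUC-JP\n";
}
```

Checking against a single known encoding sidesteps the question of what
a detector should return for input that matches none of its candidates.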

Actual result:
--------------
EUC-JP

-- 
Edit bug report at http://bugs.php.net/?id=24309&edit=1