Edit report at http://bugs.php.net/bug.php?id=54688&edit=1
ID: 54688
User updated by: g dot huebgen at arcor dot de
Reported by: g dot huebgen at arcor dot de
Summary: case insensitive search of stripos does not work when searching äöü in utf-8
Status: Bogus
Type: Bug
Package: Strings related
Operating System: Linux
PHP Version: 5.3.6
Block user comment: N
Private report: N
New Comment:
You are right. Your mb_stripos works fine.
My mistake in this was that I forgot the parameter "UTF-8"!
Now everything is clear.
Thank you
Gerhard
Previous Comments:
------------------------------------------------------------------------
[2011-05-09 06:44:51] [email protected]
Well, somewhere along the way you have messed up your encoding, since it
works fine when both strings are UTF-8:
var_dump(mb_stripos("Übermut","über",0,"UTF-8"));
Are you saying that this doesn't give you int(0) on your platform?
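
A minimal sketch of the "both strings are UTF-8" condition, assuming the
mbstring extension is loaded. The helper to_utf8() is hypothetical and not
part of PHP; it also assumes that any input which is not valid UTF-8 is
ISO-8859-1, which cannot be detected reliably in general. The strings are
written as explicit byte escapes so the example does not depend on the
encoding of the script file.

```php
<?php
// Hypothetical helper: return $s unchanged if it is valid UTF-8,
// otherwise treat it as ISO-8859-1 (an assumption) and convert it.
function to_utf8($s)
{
    if (mb_check_encoding($s, "UTF-8")) {
        return $s;
    }
    return mb_convert_encoding($s, "UTF-8", "ISO-8859-1");
}

$haystackLatin1 = "\xDCbermut";   // "Übermut" stored as ISO-8859-1
$needleUtf8     = "\xC3\xBCber";  // "über" stored as UTF-8

// Normalise both sides before searching, then pass the encoding explicitly.
$haystack = to_utf8($haystackLatin1);
$needle   = to_utf8($needleUtf8);
$pos      = mb_stripos($haystack, $needle, 0, "UTF-8");

echo bin2hex($haystack), "\n"; // c39c6265726d7574
var_dump($pos);                // int(0)
```

If the haystack is searched without this normalisation, the two strings are
in different encodings and a match should not be expected.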
------------------------------------------------------------------------
[2011-05-09 06:33:40] g dot huebgen at arcor dot de
The description of utf8_decode states clearly that this function decodes
UTF8 text. The manual says:
"utf8_decode - Converts a string with ISO-8859-1 characters encoded
with UTF-8 to single-byte ISO-8859-1"
So my text is indeed in UTF-8 and my remark on utf8_decode only confirms
what rasmus (comment #1) said.
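
The direction of the conversion can be checked on the bytes themselves. The
sketch below assumes mbstring is loaded and uses mb_convert_encoding()
rather than utf8_decode(), since utf8_decode() is deprecated as of PHP 8.2;
the conversion from UTF-8 to ISO-8859-1 is the same direction the manual
text quoted above describes.

```php
<?php
// "über" as explicit UTF-8 bytes (5 bytes: the ü takes two).
$utf8 = "\xC3\xBCber";

// UTF-8 -> ISO-8859-1: the direction utf8_decode() works in.
$latin1 = mb_convert_encoding($utf8, "ISO-8859-1", "UTF-8");

// ISO-8859-1 -> UTF-8: the reverse direction, as utf8_encode() does.
$roundTrip = mb_convert_encoding($latin1, "UTF-8", "ISO-8859-1");

echo bin2hex($utf8), "\n";      // c3bc626572
echo bin2hex($latin1), "\n";    // fc626572
echo bin2hex($roundTrip), "\n"; // c3bc626572

// Only the original bytes are valid UTF-8; a lone 0xFC is not.
var_dump(mb_check_encoding($utf8, "UTF-8"));   // bool(true)
var_dump(mb_check_encoding($latin1, "UTF-8")); // bool(false)
```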
------------------------------------------------------------------------
[2011-05-08 20:25:23] [email protected]
That means your string is not actually in UTF-8. utf8_decode() converts
text in ISO-8859-1 to UTF-8. You stated initially that you had text
encoded in UTF-8.
------------------------------------------------------------------------
[2011-05-08 20:17:42] g dot huebgen at arcor dot de
Hi rasmus.
Now I tried mb_stripos, but the result is no different from stripos.
The same program but using mb_stripos:
$text = file_get_contents("test-utf8.txt");
$str = "über";
if (($pos = mb_stripos($text, $str)) !== false)
    echo $str." found";
else
    echo $str." not found";
output is: not found!
If I use utf8_decode for both $text and $str then stripos will work
properly.
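
The call above omits the encoding argument, so mb_stripos() falls back to
the mbstring internal encoding. The sketch below shows how that setting can
change the result, assuming mbstring is loaded. The value of $text is a
made-up stand-in for the contents of test-utf8.txt, which is not shown in
this report.

```php
<?php
// Hypothetical sample standing in for the contents of test-utf8.txt.
$text = "Voller \xC3\x9Cbermut"; // "Voller Übermut" in UTF-8
$str  = "\xC3\xBCber";           // "über" in UTF-8

$previous = mb_internal_encoding();

// With a single-byte internal encoding the UTF-8 bytes are read as
// separate Latin-1 characters, so Ü and ü never fold to the same value.
mb_internal_encoding("ISO-8859-1");
$withLatin1 = mb_stripos($text, $str);

// With UTF-8 as the internal encoding the same call finds the match.
mb_internal_encoding("UTF-8");
$withUtf8 = mb_stripos($text, $str);

mb_internal_encoding($previous); // restore the original setting

var_dump($withLatin1); // bool(false)
var_dump($withUtf8);   // int(7)
```

Passing "UTF-8" as the fourth argument to mb_stripos() has the same effect
for a single call without touching the global setting.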
------------------------------------------------------------------------
[2011-05-08 17:43:47] [email protected]
This is not a bug. The base string handling functions in PHP do not
support multibyte character sets. Since UTF-8 is compatible with
single-byte charsets at the low end, it may appear to work for UTF-8,
but it will break as soon as you hit an actual multibyte character. You
can use mb_stripos() in this case, or you can use the function
overloading support in mbstring to make your stripos multibyte-aware.
See http://de.php.net/manual/en/mbstring.overload.php
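
A minimal sketch of the difference described above, assuming the mbstring
extension is loaded. The strings are given as explicit UTF-8 byte escapes
so the result does not depend on how the script file itself is saved.

```php
<?php
$haystack = "\xC3\x9Cbermut"; // "Übermut" in UTF-8
$needle   = "\xC3\xBCber";    // "über" in UTF-8

// strlen() counts bytes, mb_strlen() counts characters.
$byteLength = strlen($needle);             // 5
$charLength = mb_strlen($needle, "UTF-8"); // 4

// stripos() compares byte by byte; the second byte of Ü (0x9C) and of
// ü (0xBC) are not an upper/lower pair, so no match is found.
$bytePos = stripos($haystack, $needle);

// mb_stripos() decodes the strings first and treats Ü and ü as the
// same letter in different case.
$charPos = mb_stripos($haystack, $needle, 0, "UTF-8");

var_dump($byteLength, $charLength); // int(5) int(4)
var_dump($bytePos);                 // bool(false)
var_dump($charPos);                 // int(0)
```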
------------------------------------------------------------------------
The remainder of the comments for this report are too long. To view
the rest of the comments, please view the bug report online at
http://bugs.php.net/bug.php?id=54688