ID: 31064
Updated by: [EMAIL PROTECTED]
Reported By: km at control-b dot de
-Status: Open
+Status: Bogus
Bug Type: Strings related
Operating System: windows
PHP Version: 5.0.2
New Comment:
This is not a bug. PHP doesn't know anything about UTF-8 encodings and
will split up a word at any character that is not in [A-z].
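
If a count that treats UTF-8 letters as part of a word is needed, one
possible workaround is to count matches with PCRE in UTF-8 mode. The
following is only a sketch, not part of PHP: the helper name
utf8_word_count is hypothetical, it assumes PHP 7 or later with a PCRE
build that has Unicode property support, and it only approximates the
str_word_count rules (letters, apostrophes and hyphens).

```php
<?php
// Hypothetical helper (not a PHP built-in): counts runs of Unicode
// letters, apostrophes and hyphens in a UTF-8 encoded string.
function utf8_word_count(string $text): int
{
    // \p{L} matches any Unicode letter; the /u modifier makes PCRE
    // treat both the pattern and the subject as UTF-8.
    $count = preg_match_all("/[\\p{L}'-]+/u", $text);

    // preg_match_all() returns false on failure, for example when the
    // input is not valid UTF-8; report 0 words in that case.
    return $count === false ? 0 : $count;
}

echo utf8_word_count('wörk'), "\n";                  // 1
echo utf8_word_count('straßenbahnölbehälter'), "\n"; // 1
```

This only works if the script file and the input are really UTF-8
encoded; strings in other encodings would have to be converted first.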
Previous Comments:
------------------------------------------------------------------------
[2004-12-12 01:57:22] km at control-b dot de
Description:
------------
str_word_count returns the wrong number if the German umlaut "ö" is
contained in the word.
It is okay if the umlaut is the first or the last character.
The same goes for the ligature "ß".
Reproduce code:
---------------
echo str_word_count('wäre'); # 1 - okay
echo str_word_count('würde'); # 1 - okay
echo str_word_count('wérk'); # 1 - okay
echo str_word_count('wörk'); # 2 - wrong!!!
echo str_word_count('örk'); # 1 - okay
echo str_word_count('werök'); # 2 - wrong!!!
echo str_word_count('weräk'); # 1 - okay
echo str_word_count('straßenbahnölbehälter'); # 3 words???
Expected result:
----------------
The above code should always return 1.
------------------------------------------------------------------------