From: jdespatis at yahoo dot fr
Operating system: Linux 2.6.15 Debian Testing
PHP version: 5.1.4
PHP Bug Type: mbstring related
Bug description: preg_split does not split correctly with \W on a UTF-8 string
Description:
------------
preg_split("/\W/u", $utf8_string) splits in the middle of words: each Cyrillic letter is treated as a delimiter.
Reproduce code:
---------------
print_r(preg_split("/(\W)/u", "этот", -1, PREG_SPLIT_DELIM_CAPTURE));
Note: the string above is UTF-8. If it shows up as HTML entities, convert
them to UTF-8 first. It is a Russian word that reads "etot", where the
first letter looks like a reversed epsilon.
For now, I have made my code work by using \P{L} instead of \W.
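A minimal sketch of that workaround, assuming PCRE was built with Unicode property support (needed for \P{L}). The helper name split_words is hypothetical and only used here for illustration:

```php
<?php
// Hypothetical helper (not part of PHP): split on every character that is
// not a Unicode letter, keeping the delimiters as in the reproduce code.
function split_words($text)
{
    // \P{L} matches any character that is not a letter (Unicode property).
    // The /u modifier makes PCRE treat pattern and subject as UTF-8.
    $parts = preg_split('/(\P{L})/u', $text, -1, PREG_SPLIT_DELIM_CAPTURE);
    if ($parts === false) {
        // preg_split returns false on failure, e.g. invalid UTF-8 input.
        throw new RuntimeException('preg_split failed');
    }
    return $parts;
}

print_r(split_words("этот"));     // one element: the whole word
print_r(split_words("этот дом")); // both words, with the space between them
```

Note that \P{L} is not an exact replacement for \W: digits and underscores also become delimiters. If those should stay inside words, a class such as [^\p{L}\p{N}_] may be closer to the intended behaviour.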
Expected result:
----------------
Array
(
[0] => этот
)
Actual result:
--------------
Array
(
[0] =>
[1] => э
[2] =>
[3] => т
[4] =>
[5] => о
[6] =>
[7] => т
[8] =>
)
--
Edit bug report at http://bugs.php.net/?id=37794&edit=1