From:
Operating system: Linux
PHP version: 5.3.3
Package: PCRE related
Bug Type: Bug
Bug description: PCRE meta-characters not working with UTF-8

Description:
------------
PCRE meta-characters such as \b and \w do not work with Unicode strings. In the
test script below, which uses the /u modifier on a UTF-8 string, the character
ß is treated as a non-word character, so \b matches between "ß" and "wasser"
and [^\w] matches "ß" itself.

PHP-5.3.3 (32Bit)

pcre

PCRE (Perl Compatible Regular Expressions) Support => enabled
PCRE Library Version => 8.02 2010-03-19

Directive => Local Value => Master Value
pcre.backtrack_limit => 100000 => 100000
pcre.recursion_limit => 100000 => 100000

iconv

iconv support => enabled
iconv implementation => glibc
iconv library version => 2.10.1

Directive => Local Value => Master Value
iconv.input_encoding => ISO-8859-1 => ISO-8859-1
iconv.internal_encoding => ISO-8859-1 => ISO-8859-1
iconv.output_encoding => ISO-8859-1 => ISO-8859-1

Test script:
---------------
<?php
// encoding: UTF-8

$message = 'Der ist ein Süßwasserpool Süsswasserpool ... verschiedene Wassersportmöglichkeiten bei ...';

// Case 1: word boundary before "wasser" (case-insensitive, UTF-8 mode)
$pattern = '/\bwasser/iu';
preg_match_all($pattern, $message, $match, PREG_OFFSET_CAPTURE);
var_dump($match);

// Case 2: a single non-word character followed by "wasser"
$pattern = '/[^\w]wasser/iu';
preg_match_all($pattern, $message, $match, PREG_OFFSET_CAPTURE);
var_dump($match);

Expected result:
----------------
array(1) {
  [0]=>
  array(1) {
    [0]=>
    array(2) {
      [0]=>
      string(6) "Wasser"
      [1]=>
      int(61)
    }
  }
}
array(1) {
  [0]=>
  array(1) {
    [0]=>
    array(2) {
      [0]=>
      string(7) " Wasser"
      [1]=>
      int(60)
    }
  }
}

Actual result:
--------------
array(1) {
  [0]=>
  array(2) {
    [0]=>
    array(2) {
      [0]=>
      string(6) "wasser"
      [1]=>
      int(17)
    }
    [1]=>
    array(2) {
      [0]=>
      string(6) "Wasser"
      [1]=>
      int(61)
    }
  }
}
array(1) {
  [0]=>
  array(2) {
    [0]=>
    array(2) {
      [0]=>
      string(8) "ßwasser"
      [1]=>
      int(15)
    }
    [1]=>
    array(2) {
      [0]=>
      string(7) " Wasser"
      [1]=>
      int(60)
    }
  }
}

-- 
Edit bug report at http://bugs.php.net/bug.php?id=52971&edit=1
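
Possible workaround (a sketch, not part of the original report and not a confirmed fix): the word boundary in the first pattern can be emulated with a negative lookbehind on Unicode properties instead of \b. This assumes the PCRE library was built with Unicode property support, so that \p{L} and \p{N} are available together with the /u modifier. The pattern below is an illustration chosen for this test string only.

```php
<?php
// encoding: UTF-8

$message = 'Der ist ein Süßwasserpool Süsswasserpool ... verschiedene Wassersportmöglichkeiten bei ...';

// Assumption: emulate "\b before wasser" by requiring that the preceding
// character is not a Unicode letter, a Unicode digit, or an underscore.
// The lookbehind is one character wide, which PCRE accepts in UTF-8 mode.
$pattern = '/(?<![\p{L}\p{N}_])wasser/iu';

preg_match_all($pattern, $message, $match, PREG_OFFSET_CAPTURE);
var_dump($match);
```

With this pattern, "wasser" after "ß" (byte offset 17) and after "s" (inside "Süsswasserpool") is rejected, and only "Wasser" at byte offset 61 should remain, which is the result the report expects from \b.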