 ID:               45850
 Updated by:       [EMAIL PROTECTED]
 Reported By:      thunder013 at yopmail dot com
-Status:           Open
+Status:           Closed
 Bug Type:         PCRE related
 Operating System: *
 PHP Version:      5.2.6
 New Comment:

This bug has been fixed in CVS.

Snapshots of the sources are packaged every three hours; this change
will be in the next snapshot. You can grab the snapshot at
http://snaps.php.net/.

Thank you for the report, and for helping us make PHP better.

Merged the VERY simple fix to PHP_5_2.


Previous Comments:
------------------------------------------------------------------------

[2008-08-18 22:09:28] [EMAIL PROTECTED]

Bug #42737 was fixed in 5_3 and HEAD, so running your example code
on 5.3 gives the expected result.

------------------------------------------------------------------------

[2008-08-18 11:47:22] thunder013 at yopmail dot com

Oh... In fact, bug #42737, which I linked, concerns the sequence "\n\r",
not "\n\t"...

------------------------------------------------------------------------

[2008-08-18 11:29:53] thunder013 at yopmail dot com

Description:
------------
I want to use preg_split with the u modifier to split a UTF-8 string
into individual characters, like this:

preg_split('//u', $txt, -1, PREG_SPLIT_NO_EMPTY);

It works fine. However, there is a bug when the string contains the
sequence "\n\t" (bytes 0x0A 0x09): the two characters are NOT split
(see the attached example).

Note that this bug isn't present when preg_split is used without the
u modifier.

This bug was reported previously for version 5.2.4 at
http://bugs.php.net/bug.php?id=42737, but is nevertheless *still*
present in versions 5.2.5 and 5.2.6!

Reproduce code:
---------------
<?php
$txt = "abc\n\txyz!";
$tab = preg_split('//u', $txt, -1, PREG_SPLIT_NO_EMPTY);
print_r($tab);
echo '$tab[3]: len = ', strlen($tab[3]), ', hex = ', bin2hex($tab[3]), "\n";
?>

Expected result:
----------------
Array
(
    [0] => a
    [1] => b
    [2] => c
    [3] => 

    [4] => 	
    [5] => x
    [6] => y
    [7] => z
    [8] => !
)
$tab[3]: len = 1, hex = 0a

Actual result:
--------------
$ php test.php
Array
(
    [0] => a
    [1] => b
    [2] => c
    [3] => 
	
    [4] => x
    [5] => y
    [6] => z
    [7] => !
)
$tab[3]: len = 2, hex = 0a09


------------------------------------------------------------------------
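
For readers who cannot upgrade, the following is a minimal sketch of another way to get one array element per UTF-8 character. It matches each character with preg_match_all instead of splitting on an empty pattern. The helper name utf8_chars is hypothetical and does not come from the report, and the report does not say whether this approach avoids the problem on 5.2.6, so treat it as an untested assumption for the affected versions.

```php
<?php
// Hypothetical helper (not part of the bug report): return the characters
// of a UTF-8 string as an array, one element per character.
function utf8_chars($txt)
{
    // u = treat pattern and subject as UTF-8
    // s = let "." match newline characters as well
    if (preg_match_all('/./us', $txt, $matches) === false) {
        // preg_match_all() returns false on error, e.g. invalid UTF-8 input
        return array();
    }
    return $matches[0];
}

$txt = "abc\n\txyz!";
$tab = utf8_chars($txt);

echo count($tab), "\n";
echo '$tab[3]: hex = ', bin2hex($tab[3]), "\n";
echo '$tab[4]: hex = ', bin2hex($tab[4]), "\n";
```

On a PHP build where the u modifier behaves correctly, "\n" and "\t" end up in separate elements, which is the behaviour the expected result above describes for preg_split.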