From: jrose at lgb-inc dot com
Operating system: Linux
PHP version: 4.3.1
PHP Bug Type: *Regular Expressions
Bug description: PREG_SPLIT_NO_EMPTY causes ctl chars to not be excluded with [^a-zA-Z0-9] or \W

I slammed into this whilst converting Access data into MySQL. I was attempting to break the following apart into words (* here indicates a character that fails to match /[\w,.-?]/):

|*|W|a|s|h|i|n|g|t|o|n|,|*|D|C|*|*   // text
057617368696e67746f6e2c20444320200   // hex

I first tried splitting on any whitespace or commas:

preg_split( '/[\\s,]+/', $string, PREG_SPLIT_NO_EMPTY );

When this didn't work, I examined the hexadecimal values as above and, assuming that control characters weren't included in the \s group, tried several things, including the very simple:

preg_split( '/\\W/', $string, PREG_SPLIT_NO_EMPTY );

Nothing worked, and ultimately I had to use preg_match_all() to split the string up.

Example:

$string = chr(5) . 'Washington,' . chr(4) . 'DC' . chr(2);
$parts = preg_split( '/\\W/', $string, PREG_SPLIT_NO_EMPTY );
echo join('|', $parts);

-- 
Edit bug report at http://bugs.php.net/?id=23904&edit=1
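
One possible reading of the example in this report, offered as a sketch and not a confirmed diagnosis: preg_split() is documented as taking its arguments in the order pattern, subject, limit, flags. In the reported calls PREG_SPLIT_NO_EMPTY sits in the third position, so it would be read as a limit of 1 and not as a flag. The snippet below compares the reported call with a four-argument call. The variable names $asReported and $asIntended are made up for illustration, and the four-argument form is only an assumption about what was intended.

```php
<?php
// Reproduction string from the report: control characters around the words.
$string = chr(5) . 'Washington,' . chr(4) . 'DC' . chr(2);

// As reported: PREG_SPLIT_NO_EMPTY (integer 1) lands in the $limit slot,
// so at most one piece comes back, i.e. the unsplit string.
$asReported = preg_split('/\\W/', $string, PREG_SPLIT_NO_EMPTY);

// Assumed intent: no limit (-1), with the flag in the fourth position.
$asIntended = preg_split('/\\W/', $string, -1, PREG_SPLIT_NO_EMPTY);

echo count($asReported), "\n";      // number of pieces from the reported call
echo join('|', $asIntended), "\n";  // pieces from the four-argument call
```

Under these assumptions the first call returns a single element holding the whole string, while the second prints Washington|DC, which would suggest that \W does match the control characters.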