ID: 37791
Updated by: [EMAIL PROTECTED]
Reported By: john at jcoppens dot com
-Status: Open
+Status: Bogus
Bug Type: Regexps related
Operating System: Linux (2.6.14)
PHP Version: 5.1.4
New Comment:

use preg_split() instead.


Previous Comments:
------------------------------------------------------------------------

[2006-06-12 19:43:52] judas dot iscariote at gmail dot com

Hi:

You just rediscovered that PHP 5 doesn't support Unicode. Unicode support will be available in PHP 6 and up.

In the meantime, check:
http://cl2.php.net/manual/en/function.mb-split.php

This is actually not a bug.

Saludos ;-)

------------------------------------------------------------------------

[2006-06-12 19:29:16] john at jcoppens dot com

Description:
------------
Executing

    setlocale(LC_ALL, 'es_ES');
    $str = "La nación es grande";
    $words = split("[^[:alpha:]']+", $str);

words in $str are not recognized as Spanish. For example, 'nación' is still split at the 'ó'. setlocale() returns es_ES, indicating that the locale was set.

Expected result:
----------------
The original text split into correct words. According to the regexp docs, [:alpha:] should include the Spanish characters if the locale is set.

Actual result:
--------------
Words are split at non-ASCII characters (éáí etc).

------------------------------------------------------------------------
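
The preg_split() suggestion in the newest comment can be sketched as follows. This is only a minimal illustration, not the responder's exact code: it assumes that $str is valid UTF-8 and that the PCRE library bundled with PHP was built with UTF-8 and Unicode property support, so that the /u modifier and \p{L} (any Unicode letter) are available. The PREG_SPLIT_NO_EMPTY flag is an added choice to drop empty leading or trailing pieces.

```php
<?php
// Same input string as in the report; assumed to be UTF-8 encoded.
$str = "La nación es grande";

// Split on runs of characters that are neither letters nor apostrophes.
// \p{L} matches any Unicode letter; the /u modifier makes PCRE treat
// both the pattern and the subject as UTF-8.
$words = preg_split("/[^\\p{L}']+/u", $str, -1, PREG_SPLIT_NO_EMPTY);

print_r($words);
```

Under these assumptions 'nación' stays in one piece, because 'ó' is matched as a letter instead of being treated as a separator.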