ID:               37791
 Updated by:       [EMAIL PROTECTED]
 Reported By:      john at jcoppens dot com
-Status:           Open
+Status:           Bogus
 Bug Type:         Regexps related
 Operating System: Linux (2.6.14)
 PHP Version:      5.1.4
 New Comment:

Use preg_split() instead.
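
A minimal sketch of what that suggestion might look like for the string in the report. It assumes the script and $str are UTF-8 encoded and that PCRE was built with UTF-8 and Unicode property support; the pattern is an illustration, not something taken from the report.

```php
<?php
// Assumption: this file and $str are UTF-8 encoded.
$str = "La nación es grande";

// \p{L} matches any Unicode letter. The /u modifier makes PCRE treat
// both pattern and subject as UTF-8, so 'ó' counts as a single letter.
// PREG_SPLIT_NO_EMPTY drops empty pieces at the start or end.
$words = preg_split('/[^\p{L}\']+/u', $str, -1, PREG_SPLIT_NO_EMPTY);

print_r($words);
```

Unlike split(), this does not depend on setlocale(); the letter class comes from the Unicode property tables rather than the current locale.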


Previous Comments:
------------------------------------------------------------------------

[2006-06-12 19:43:52] judas dot iscariote at gmail dot com

Hi,
You have just rediscovered that PHP 5 doesn't support Unicode.
Unicode support will be available in PHP 6 and up.
In the meantime, check:

http://cl2.php.net/manual/en/function.mb-split.php
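
A rough sketch of the mb_split() route, assuming the mbstring extension is loaded and the string is UTF-8. The choice of 'UTF-8' as the regex encoding is an assumption about the reporter's setup.

```php
<?php
// Assumptions: mbstring is available and $str is UTF-8 encoded.
// Tell the multibyte regex engine which encoding the subject uses.
mb_regex_encoding('UTF-8');

$str = "La nación es grande";

// mb_split() takes the pattern without delimiters, like split() does.
$mbWords = mb_split("[^[:alpha:]']+", $str);

print_r($mbWords);
```

The pattern is the one from the original report; only the function and the declared encoding change.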


This is actually not a bug.

Saludos ;-)

------------------------------------------------------------------------

[2006-06-12 19:29:16] john at jcoppens dot com

Description:
------------
Executing

  setlocale(LC_ALL, 'es_ES');          
  $str = "La nación es grande";
  $words = split("[^[:alpha:]']+", $str);

words in $str are not recognized as Spanish. For example,
'nación' is still split at the 'ó'.

setlocale() returns 'es_ES', indicating that the locale was set.


Expected result:
----------------
The original text is split into the correct words. According to the
regexp docs, [:alpha:] should include the Spanish characters
once the locale is set.

Actual result:
--------------
Words are split at non-ASCII characters (é, á, í, etc.).



------------------------------------------------------------------------

