ID:               37791
 Comment by:       judas dot iscariote at gmail dot com
 Reported By:      john at jcoppens dot com
 Status:           Open
 Bug Type:         Regexps related
 Operating System: Linux (2.6.14)
 PHP Version:      5.1.4
 New Comment:

Hi:
You just rediscovered that PHP 5 doesn't support Unicode...
Unicode support will be available in PHP 6 and up.
In the meantime, check:

http://cl2.php.net/manual/en/function.mb-split.php
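
As a rough sketch of how the mb_split() function linked above could replace split() for the code in this report: the example assumes the script file and $str are UTF-8 encoded and that the mbstring extension is loaded. Neither is stated in the report, so adjust the encoding name if your data differs.

```php
<?php
// Assumption: this file and $str are UTF-8 encoded, and the mbstring
// extension is loaded. Neither is confirmed by the original report.
mb_regex_encoding('UTF-8');

$str = "La nación es grande";

// Same pattern as in the report, but mb_split() matches it against
// multibyte characters instead of single bytes, so 'ó' counts as a letter.
$words = mb_split("[^[:alpha:]']+", $str);

print_r($words);
```

With multibyte matching, setlocale() should not be needed for this particular split, since the character class is resolved from the regex encoding rather than from the locale.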


This is actually not a bug.

Saludos ;-)


Previous Comments:
------------------------------------------------------------------------

[2006-06-12 19:29:16] john at jcoppens dot com

Description:
------------
Executing

  setlocale(LC_ALL, 'es_ES');          
  $str = "La nación es grande";
  $words = split("[^[:alpha:]']+", $str);

words in $str are not recognized as Spanish. For example,
'nación' is still split at the 'ó'.

setlocale returns es_ES indicating the locale was set.


Expected result:
----------------
The original text split into correct words. According to the
regexp docs, [:alpha:] should acquire the Spanish characters
if the locale is set.

Actual result:
--------------
Words are split at non-ASCII characters (é, á, í, etc.).



------------------------------------------------------------------------

