ID: 39744
User updated by: sdamir at gmail dot com
Reported By: sdamir at gmail dot com
Status: Open
Bug Type: *Regular Expressions
Operating System: Linux 2.6.18
PHP Version: 5.2.0
New Comment:
I don't know why, but your bug system converted letters in my PHP code
into &#crap; stuff.
Previous Comments:
------------------------------------------------------------------------
[2006-12-05 15:48:49] sdamir at gmail dot com
Description:
------------
I am trying to match all alphabetic UTF-8 characters. I know (tested)
that in Perl, if $string is UTF-8 encoded and I use a regex like =~ /\w/,
it matches all alphabetic UTF-8 characters (Cyrillic, Chinese, English,
etc.). This is not the case in PHP. I read that I need to use special
patterns like \pL, but this doesn't work for me either: it matches some
characters, but Cyrillic letters aren't matched.
I don't know if this is a bug or if I am doing something wrong, but I
really searched the hell out of everything and visited tons of IRC support
channels, and no one has an answer to this.
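For comparison, here is a minimal sketch of how \pL with the /u modifier is expected to behave. It assumes the subject is valid UTF-8 and that PCRE was built with Unicode property support; the report confirms neither. The string is spelled out as UTF-8 bytes so the encoding of the source file cannot interfere:

```php
<?php
// " Срећа " written as explicit UTF-8 bytes (assumption: the subject in the
// report was meant to be exactly this string).
$str = " \xD0\xA1\xD1\x80\xD0\xB5\xD1\x9B\xD0\xB0 ";

// /u makes PCRE treat pattern and subject as UTF-8;
// \pL matches any character with the Unicode "Letter" property.
$count = preg_match_all('/\pL/u', $str, $m);

var_dump($count);              // number of letters matched
var_dump(bin2hex($m[0][0]));   // bytes of the first matched letter
```

On a build with Unicode property support this should report five matches, the first being the two-byte sequence for "С". If it does not, the PCRE build is a more likely suspect than the pattern.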
Reproduce code:
---------------
<?php
// setlocale(LC_ALL, 'en_US.utf8');
// If I set the locale to en_US, it matches some characters like öåä but
// not Cyrillic; en_US.utf8 won't match anything.
$str=" Срећа ";
utf8_encode($str); // return value is discarded, so $str is left unchanged
var_dump($str);
preg_match("/[\w\pL]/u",$str, $r);
var_dump($r);
?>
Expected result:
----------------
string(3) " s "
array(1) {
[0]=>
string(1) "С"
}
Actual result:
--------------
string(12) " Срећа "
array(0) {
}
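One way to narrow down an empty result like the one above is to check whether PCRE rejected the subject as invalid UTF-8. This is a rough sketch, assuming preg_last_error() is available (PHP 5.2.0 or later); probe_pattern is a hypothetical helper, not part of the report:

```php
<?php
// Hypothetical helper: runs a pattern and reports the raw return value
// together with whether PCRE flagged the subject as malformed UTF-8.
function probe_pattern($pattern, $subject)
{
    // @ suppresses warnings so only the returned data is inspected.
    $result = @preg_match_all($pattern, $subject, $matches);
    return array(
        'result'   => $result,
        'bad_utf8' => preg_last_error() === PREG_BAD_UTF8_ERROR,
    );
}

$valid   = " \xD0\xA1\xD1\x80\xD0\xB5\xD1\x9B\xD0\xB0 "; // " Срећа " as UTF-8 bytes
$invalid = " \xD0 ";                                      // truncated two-byte sequence

$validReport   = probe_pattern('/\pL/u', $valid);
$invalidReport = probe_pattern('/\pL/u', $invalid);

var_dump($validReport);
var_dump($invalidReport);
```

If bad_utf8 is true for the real input, the string is not valid UTF-8 when it reaches preg_match(), for example because the source file was saved in another encoding or the text was altered in transit.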
------------------------------------------------------------------------