ID:               41588
 Updated by:       [EMAIL PROTECTED]
 Reported By:      spam02 at pornel dot net
 Status:           Open
 Bug Type:         Documentation problem
 Operating System: *
 PHP Version:      6.0.0-dev (20070509)
 New Comment:

>preg_match() with 'u' modifier is supposed to use UTF-8, but this
>switch doesn't affect offset parameter, which is always in bytes.

Right. PHP is not supposed to parse the regexp to detect which
modifiers were used, so the offset is counted in bytes whether or not
'u' is present.
The offset switches from bytes to code points only in Unicode mode.
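
For callers that hold an offset counted in code points, one possible
workaround is to convert it to a byte offset before calling preg_match().
The sketch below is only an illustration: the helper name
utf8_char_offset_to_byte_offset() is hypothetical and not part of PHP,
and it assumes the subject is valid UTF-8 and a PHP version with scalar
type declarations.

```php
<?php
// Hypothetical helper, not part of PHP: converts an offset counted in
// UTF-8 code points into the byte offset that preg_match() expects.
function utf8_char_offset_to_byte_offset(string $subject, int $charOffset): int
{
    if ($charOffset < 0) {
        throw new OutOfRangeException('Offset must not be negative.');
    }
    if ($charOffset === 0) {
        return 0;
    }
    // Match exactly $charOffset code points from the start of the subject.
    // The 's' modifier lets '.' match newlines as well.
    // PCRE limits counted repetitions to 65535, so larger offsets fail here.
    $pattern = '/^.{' . $charOffset . '}/us';
    if (preg_match($pattern, $subject, $prefix) !== 1) {
        throw new OutOfRangeException('Offset out of range, or subject is not valid UTF-8.');
    }
    // strlen() counts bytes, which is the unit preg_match() uses for offsets.
    return strlen($prefix[0]);
}

$subject = urldecode('%C2%AE') . 'NY';   // bytes: C2 AE 4E 59
$byteOffset = utf8_char_offset_to_byte_offset($subject, 2);
preg_match('/./u', $subject, $m, 0, $byteOffset);
echo $m[0];
```

With the subject from this report, a code-point offset of 2 becomes a
byte offset of 3, so the match starts at "Y" instead of "N".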


Previous Comments:
------------------------------------------------------------------------

[2007-06-04 13:08:02] spam02 at pornel dot net

(fixed PHP version)

------------------------------------------------------------------------

[2007-06-04 13:04:43] spam02 at pornel dot net

Description:
------------
preg_match() with 'u' modifier is supposed to use UTF-8, but this
switch doesn't affect offset parameter, which is always in bytes.

This gotcha at least deserves to be documented, although consistent
Unicode support would be even better.


Reproduce code:
---------------
<?php
// Subject is "®NY"; "®" takes two bytes in UTF-8 (C2 AE), so byte offset 2 is "N".
preg_match('/./u', urldecode('%C2%AE') . 'NY', $m, 0, 2);
echo $m[0];


Expected result:
----------------
Y

Actual result:
--------------
N
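
The actual result follows from the byte layout of the subject. A short
sketch, using only the string from the reproduce code, shows what sits
at byte offset 2:

```php
<?php
$subject = urldecode('%C2%AE') . 'NY';

// Hex dump of the raw bytes: "®" is C2 AE, "N" is 4E, "Y" is 59.
echo bin2hex($subject), "\n";        // c2ae4e59

// strlen() counts bytes, not code points.
echo strlen($subject), "\n";         // 4

// The byte at offset 2 is "N", which is where preg_match() starts.
echo substr($subject, 2, 1), "\n";   // N
```

The subject has three code points but four bytes, so a byte offset of 2
skips only the "®" character.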


------------------------------------------------------------------------

