ID: 40871
User updated by: ismith at motorola dot com
Reported By: ismith at motorola dot com
Status: Bogus
Bug Type: PCRE related
Operating System: Windows Server 2003 SP1
PHP Version: 5.2.1
New Comment:
Tony, thanks for the response... but more info would be good. Where do
I report this? How do I get it fixed?
Previous Comments:
------------------------------------------------------------------------
[2007-03-20 20:00:17] ismith at motorola dot com
BTW, this bug surfaced in MediaWiki 1.9.3 on a private wiki, where it
causes some pages containing pasted-in Windows "smart" quotes to be
displayed as blank.
------------------------------------------------------------------------
[2007-03-20 19:58:25] [EMAIL PROTECTED]
This is what the underlying PCRE library returns.
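
For readers who want to see what PHP surfaces in this case, here is a
minimal sketch. It assumes that preg_replace() returns NULL when PCRE
rejects the subject, and that preg_last_error() then reports
PREG_BAD_UTF8_ERROR (both are available as of PHP 5.2.0). The helper name
checked_preg_replace is hypothetical and not part of PHP.

```php
<?php
// Hypothetical helper, not a PHP built-in: wraps preg_replace() so that
// a PCRE failure is reported instead of being printed as "".
function checked_preg_replace($pattern, $replacement, $subject)
{
    $result = preg_replace($pattern, $replacement, $subject);
    if ($result === null) {
        // preg_last_error() reports why the last PCRE call failed.
        $code = preg_last_error();
        $reason = ($code === PREG_BAD_UTF8_ERROR)
            ? 'malformed UTF-8 data'
            : 'PCRE error code ' . $code;
        throw new RuntimeException('preg_replace failed: ' . $reason);
    }
    return $result;
}

// Same invalid sequence as in the reproduce code of this report.
$badText = "I love BEARS \342\200\077 and LIONS";
try {
    echo checked_preg_replace("/ELEPHANTS/iu", "MICE", $badText), "\n";
} catch (RuntimeException $e) {
    echo $e->getMessage(), "\n";
}
```

Because NULL is rendered as an empty string by printf("%s"), checking
the return value with === null is what distinguishes "PCRE failed" from
"the replacement produced an empty string".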
------------------------------------------------------------------------
[2007-03-20 19:54:33] ismith at motorola dot com
Description:
------------
I am using preg_replace to do a search and replace on some text which
contains an invalid UTF-8 code sequence. I am using the "u" modifier.
I believe that preg_replace should suppress the bad character, or
replace it with an appropriate error marker; but otherwise return the
text intact (after making the required replacements).
Both preg_replace and preg_replace_callback return an empty string in
this case, even when the search pattern matches nothing in the input.
Reproduce code:
---------------
<?php
// Text with a valid UTF-8 character sequence.
$goodText = "I hate WOMBATS \342\200\234 and COWS";
// Text with an invalid UTF-8 character sequence.
$badText = "I love BEARS \342\200\077 and LIONS";
$good2 = preg_replace("/ELEPHANTS/iu", "MICE", $goodText);
printf("Was \"%s\"; now \"%s\"\n", $goodText, $good2);
$bad2 = preg_replace("/ELEPHANTS/iu", "MICE", $badText);
printf("Was \"%s\"; now \"%s\"\n", $badText, $bad2);
?>
Expected result:
----------------
Was "I hate WOMBATS ΓÇ£ and COWS"; now "I hate WOMBATS ΓÇ£ and COWS"
Was "I love BEARS ΓÇ? and LIONS"; now "I love BEARS ΓÇ? and LIONS"
Actual result:
--------------
Was "I hate WOMBATS ΓÇ£ and COWS"; now "I hate WOMBATS ΓÇ£ and COWS"
Was "I love BEARS ΓÇ? and LIONS"; now ""
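
One possible workaround, sketched here under stated assumptions rather
than as a fix in PHP or PCRE: scrub the subject before handing it to a
"u"-mode pattern. The helper name scrub_utf8 and the '?' marker are
hypothetical. The scrubbing pattern deliberately omits the "u" modifier,
so PCRE treats the subject as raw bytes. The closure syntax assumes
PHP 5.3 or later.

```php
<?php
// Hypothetical helper: replaces every byte that is not part of a
// well-formed UTF-8 sequence with $marker. No "u" modifier is used,
// so PCRE works on raw bytes and does not reject the subject.
function scrub_utf8($text, $marker = '?')
{
    $valid = '[\x00-\x7F]'                          // ASCII
           . '|[\xC2-\xDF][\x80-\xBF]'              // 2-byte sequences
           . '|\xE0[\xA0-\xBF][\x80-\xBF]'          // 3-byte, no overlongs
           . '|[\xE1-\xEC\xEE\xEF][\x80-\xBF]{2}'   // 3-byte, general
           . '|\xED[\x80-\x9F][\x80-\xBF]'          // 3-byte, no surrogates
           . '|\xF0[\x90-\xBF][\x80-\xBF]{2}'       // 4-byte, no overlongs
           . '|[\xF1-\xF3][\x80-\xBF]{3}'           // 4-byte, general
           . '|\xF4[\x80-\x8F][\x80-\xBF]{2}';      // 4-byte, up to U+10FFFF

    return preg_replace_callback(
        '/(' . $valid . ')|./s',
        function ($m) use ($marker) {
            // Group 1 is set only when a well-formed sequence matched;
            // otherwise "." consumed a single stray byte.
            return (isset($m[1]) && $m[1] !== '') ? $m[0] : $marker;
        },
        $text
    );
}

$badText = "I love BEARS \342\200\077 and LIONS";
$bad2 = preg_replace("/ELEPHANTS/iu", "MICE", scrub_utf8($badText));
printf("Was \"%s\"; now \"%s\"\n", $badText, $bad2);
```

With the '?' marker, the two stray bytes \342 and \200 each become '?',
and the following \077 is already a literal '?', so the scrubbed text
reads "I love BEARS ??? and LIONS" and preg_replace returns it intact.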
------------------------------------------------------------------------