Edit report at https://bugs.php.net/bug.php?id=62562&edit=1
ID:                 62562
Updated by:         ras...@php.net
Reported by:        magog dot the dot ogre at gmail dot com
Summary:            preg_replace mangles UTF8 string - Windows only
Status:             Open
Type:               Bug
Package:            *Regular Expressions
Operating System:   Windows x86
PHP Version:        5.3.14
Block user comment: N
Private report:     N

New Comment:

Well, I have looked at the code. We take the raw binary string and pass it
straight to PCRE both on Windows and UNIX. So something along the way isn't
the same. But I am not a Windows guy, so I can't help you on the Windows side
of things. It works fine on my Linux box here.

Previous Comments:
------------------------------------------------------------------------
[2012-07-15 22:32:03] magog dot the dot ogre at gmail dot com

OK then, after doing some more plugging around, it appears that it still
might be a PHP issue. Correct me if I'm wrong, but here are my findings:

Create a php file with only the following content:

<?php echo preg_match("/\s+/", "ááá¤áá áááªáá")?"1":"0";

Running this on Windows will return "1"; running on Unix returns "0".

Now I've run this on PCRE, and PCRE has returned that there was no match.
Thus, it may be a PHP issue. Here is the output:

***Contents of test.txt
/\s+/
ááá¤áá áááªáá
ááá¤áá áááªáá

***Output via Cygwin, running the Windows native pcretest.exe
(redacted)@(redacted)-PC /cygdrive/c/Program Files (x86)/pcre-7.0-bin/bin
$ ./pcretest.exe test.txt
PCRE version 7.0 18-Dec-2006

/\s+/
ááá¤áá áááªáá
No match
ááá¤áá áááªáá
 0:

(I included the second example above with a space purposefully added, just to
show that the tool is functioning properly and will catch the space when it's
properly there).

------------------------------------------------------------------------
[2012-07-15 21:48:18] ras...@php.net

No, PCRE is a Perl-Compatible-Regex library but it is not the code used by
Perl itself. Many (most?) open source things that have regex support will use
PCRE.
------------------------------------------------------------------------
[2012-07-15 19:19:03] magog dot the dot ogre at gmail dot com

I have Perl itself installed; do they use PCRE? Sorry for my n00b questions.
If so, I will run a test on there shortly.

------------------------------------------------------------------------
[2012-07-14 03:12:27] ras...@php.net

hrm.. how about finding something else that links against pcre and runs on
Windows that might be able to do a replace? Like Python perhaps? I still
doubt this has anything to do with PHP. We don't mangle anything going in nor
out of pcre.

------------------------------------------------------------------------
[2012-07-14 03:08:15] magog dot the dot ogre at gmail dot com

pcretest doesn't actually perform replacements: it only does matches. I'm not
sure how I would run pcretest on this.

------------------------------------------------------------------------

The remainder of the comments for this report are too long. To view
the rest of the comments, please view the bug report online at

    https://bugs.php.net/bug.php?id=62562
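
Since the thread compares a byte-oriented match on two platforms, a small diagnostic script may help narrow things down. The following is only a sketch, not something taken from the report: the subject bytes are an assumption (the original string is garbled in this notification), describe_match() is a hypothetical helper, and trying the /u modifier is just one way to check whether single-byte character classification is involved. It prints the raw bytes of the subject, then shows where, if anywhere, \s+ matches with and without UTF-8 mode.

```php
<?php
// Hypothetical sample subject: the real string from the report is not
// recoverable here. U+1220 (an Ethiopic syllable) is E1 88 A0 in UTF-8,
// and the byte 0xA0 on its own is a no-break space in Latin-1.
$subject = "\xE1\x88\xA0\xE1\x88\xA0";

// Hypothetical helper: run preg_match and describe the outcome, including
// the byte offset and hex dump of whatever was matched.
function describe_match($pattern, $subject)
{
    $result = preg_match($pattern, $subject, $m, PREG_OFFSET_CAPTURE);
    if ($result === false) {
        return "error " . preg_last_error();
    }
    if ($result === 0) {
        return "no match";
    }
    return sprintf("match at byte offset %d: %s", $m[0][1], bin2hex($m[0][0]));
}

echo "bytes:      ", bin2hex($subject), "\n";
// Byte mode: each byte is classified on its own.
echo "byte mode:  ", describe_match('/\s+/', $subject), "\n";
// UTF-8 mode: the subject is decoded into code points before matching.
echo "UTF-8 mode: ", describe_match('/\s+/u', $subject), "\n";
```

Running the same file on the Windows and Linux builds and comparing the "byte mode" line would show which byte, if any, is being treated as whitespace on one platform but not the other.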