Edit report at https://bugs.php.net/bug.php?id=62562&edit=1
ID: 62562
User updated by: magog dot the dot ogre at gmail dot com
Reported by: magog dot the dot ogre at gmail dot com
Summary: preg_replace mangles UTF8 string - Windows only
Status: Open
Type: Bug
Package: *Regular Expressions
Operating System: Windows x86
PHP Version: 5.3.14
Block user comment: N
Private report: N
New Comment:
OK then, after doing some more digging around, it appears that it still might
be a PHP issue. Correct me if I'm wrong, but here are my findings:
Create a php file with only the following content:
<?php
echo preg_match("/\s+/", "ááá¤áá áááªáá")?"1":"0";
Running this on Windows prints "1"; running it on Unix prints "0".
I then ran the same pattern and subject through pcretest, and it reported no
match. Thus, it may be a PHP issue. Here is the output:
***Contents of test.txt
/\s+/
ááá¤áá áááªáá
ááá¤áá áááªáá
***Output via Cygwin, running the Windows native pcretest.exe
(redacted)@(redacted)-PC /cygdrive/c/Program Files (x86)/pcre-7.0-bin/bin
$ ./pcretest.exe test.txt
PCRE version 7.0 18-Dec-2006
/\s+/
ááá¤áá áááªáá
No match
ááá¤áá áááªáá
0:
(I included the second example above with a space purposefully added, just to
show that the tool is functioning properly and will catch the space when it's
properly there).
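A minimal sketch for narrowing this down further. The sample subject below is an assumption, not the string from this report: "à" (U+00E0) is encoded in UTF-8 as the bytes C3 A0, so it contains the byte 0xA0 but no whitespace character. One possible explanation, not confirmed anywhere in this thread, is that without the /u modifier the pattern is matched byte by byte, and a locale-dependent character table may classify 0xA0 (a no-break space in Windows-1252) as whitespace. Comparing byte mode with UTF-8 mode on both platforms would help confirm or rule that out:

```php
<?php
// Assumed sample input (not the string from the report): "à" is U+00E0,
// which UTF-8 encodes as the two bytes C3 A0. No whitespace is present.
$subject = "\xC3\xA0";

// Dump the raw bytes so a stray 0xA0 (or a real 0x20) is easy to spot.
echo bin2hex($subject), "\n";

// Byte mode: \s is tested against individual bytes.
// The result here is the platform-dependent part under suspicion.
$byteMode = preg_match('/\s+/', $subject);

// UTF-8 mode: the subject is decoded into code points before matching,
// so the 0xA0 continuation byte is never tested on its own.
$utf8Mode = preg_match('/\s+/u', $subject);

echo "byte mode:  ", $byteMode, "\n";
echo "UTF-8 mode: ", $utf8Mode, "\n";
```

If byte mode prints 1 on Windows and 0 on Unix while UTF-8 mode prints 0 on both, that would point at byte-level character classification and not at the subject string being altered on its way into pcre.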
Previous Comments:
------------------------------------------------------------------------
[2012-07-15 21:48:18] [email protected]
No, PCRE is a Perl-Compatible-Regex library but it is not the code used by Perl
itself. Many (most?) open source things that have regex support will use PCRE.
------------------------------------------------------------------------
[2012-07-15 19:19:03] magog dot the dot ogre at gmail dot com
I have Perl itself installed; does it use PCRE? Sorry for my n00b questions. If
so, I will run a test on there shortly.
------------------------------------------------------------------------
[2012-07-14 03:12:27] [email protected]
hrm.. how about finding something else that links against pcre and runs on
Windows that might be able to do a replace? Like Python perhaps?
I still doubt this has anything to do with PHP. We don't mangle anything going
in nor out of pcre.
------------------------------------------------------------------------
[2012-07-14 03:08:15] magog dot the dot ogre at gmail dot com
pcretest doesn't actually perform replacements: it only does matches. I'm not
sure how I would run pcretest on this.
------------------------------------------------------------------------
[2012-07-14 02:44:58] [email protected]
This is unlikely to be a native PHP issue. Can you perform a similar test using
the pcretest program from pcre.org? If you can reproduce it with that then it
takes PHP completely out of the picture and you would need to file it against
libpcre.
------------------------------------------------------------------------
The remainder of the comments for this report are too long. To view
the rest of the comments, please view the bug report online at
https://bugs.php.net/bug.php?id=62562