Edit report at http://bugs.php.net/bug.php?id=51663&edit=1
ID: 51663
Comment by: jordi dot salvat dot i dot alabart at gmail dot com
Reported by: jordi dot salvat dot i dot alabart at gmail dot com
Summary: PHP preg_match doesn't match string which should match
Status: Open
Type: Bug
Package: PCRE related
Operating System: Ubuntu
PHP Version: 5.3.2
New Comment:
I've been able to simplify the example to:
<?= preg_match("/(.+)+:/", "a:bbbbbbbbbbbbb") ? "pass" : "fail" ?>
(I've checked that this simplified form fails in PHP 5.2.10-2ubuntu6.4;
checking it in 5.3.2 too is left as an exercise for the reader).
Previous Comments:
------------------------------------------------------------------------
[2010-04-26 00:37:02] jordi dot salvat dot i dot alabart at gmail dot com
Description:
------------
This regular expression:
/^(?:[^\[\]{}']+|'[^']*')+:(?:[^\[\]{}']+|'[^']*')+$/
matches this string: a:bbbbbbbbbbbbbbb
in Perl (5.10.0-24ubuntu4):
perl <<__END__
print 'a:bbbbbbbbbbbbb' =~
q/^(?:[^\[\]{}']+|'[^']*')+:(?:[^\[\]{}']+|'[^']*')+$/;
print "\n";
__END__
1
and pcretest (libpcre3 7.8-3):
pcretest <<__END__
/^(?:[^\[\]{}']+|'[^']*')+:(?:[^\[\]{}']+|'[^']*')+$/
a:bbbbbbbbbbbbbbb
__END__
PCRE version 7.8 2008-09-05
re> data> 0: a:bbbbbbbbbbbbbbb
data>
Not, however, in PHP (5.3.2):
$ ./php --version
PHP 5.3.2 (cli) (built: Apr 25 2010 23:58:33)
Copyright (c) 1997-2010 The PHP Group
Zend Engine v2.3.0, Copyright (c) 1998-2010 Zend Technologies
$ ./php <<__END__
<?= preg_match("/^(?:[^\[\]{}']+|'[^']*')+:(?:[^\[\]{}']+|'[^']*')+$/",
"a:bbbbbbbbbbbbb") ?>
__END__
0
The bug is quite sensitive to changes in the input. Removing a couple of
"b"s makes it match. I don't know which aspects of the regexp cause it
to fail.
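To narrow down where the match flips from 1 to 0, the subject length can be swept while also recording what preg_last_error() reports after each call. The following is only a sketch: match_with_error() is a hypothetical helper that is not part of the report, the upper bound of 16 "b"s is arbitrary, and whether PCRE reports any error here at all is an assumption that the script is meant to check, not something established above.

```php
<?php
// Hypothetical helper (not from the report): run preg_match() and return
// its result together with the code from preg_last_error() (PHP >= 5.2.0).
function match_with_error($pattern, $subject)
{
    $result = preg_match($pattern, $subject);
    return array('result' => $result, 'error' => preg_last_error());
}

// Same building block as in the report, written with single quotes.
$A = '(?:[^\[\]{}\']+|\'[^\']*\')+';
$pattern = '/^' . $A . ':' . $A . '$/';

// Sweep the number of trailing "b"s; the bound of 16 is an arbitrary choice.
for ($n = 1; $n <= 16; $n++) {
    $subject = 'a:' . str_repeat('b', $n);
    $r = match_with_error($pattern, $subject);
    // Flag the one error code that is relevant if a PCRE limit is involved.
    $note = ($r['error'] === PREG_BACKTRACK_LIMIT_ERROR)
        ? ' (PREG_BACKTRACK_LIMIT_ERROR)'
        : '';
    printf("%2d b's: result=%s error=%d%s\n",
        $n, var_export($r['result'], true), $r['error'], $note);
}
```

If the error column stays at PREG_NO_ERROR for the failing lengths, the failure is a plain non-match; if it changes at the same length where the result flips, the pcre.backtrack_limit and pcre.recursion_limit ini settings would be the next thing to look at.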
For confirmation that this is indeed a bug without having to decipher
the regexp, here's proof:
<?php
$A='(?:[^\[\]{}\']+|\'[^\']*\')+';
$a= 'a';
$B=":$A";
$b= ':bbbbbbbbbbbbbbb';
print_r(preg_match("/^$A$/", "$a"));
print_r(preg_match("/^$B$/", "$b"));
print_r(preg_match("/^$A$B$/", "$a$b"));
print_r("\n");
This outputs "110", which should be impossible: if /^$A$/ matches "$a" and
/^$B$/ matches "$b", then /^$A$B$/ must also match "$a$b".
Test script:
---------------
<?= preg_match("/^(?:[^\[\]{}']+|'[^']*')+:(?:[^\[\]{}']+|'[^']*')+$/",
"a:bbbbbbbbbbbbb") ? "pass" : "fail" ?>
Expected result:
----------------
pass
Actual result:
--------------
fail
------------------------------------------------------------------------