ID:               39951
 Updated by:       [EMAIL PROTECTED]
 Reported By:      imacat at mail dot imacat dot idv dot tw
-Status:           Closed
+Status:           Bogus
 Bug Type:         PCRE related
 Operating System: Linux 2.6.16.29
 PHP Version:      5.2.0


Previous Comments:
------------------------------------------------------------------------

[2006-12-26 15:06:24] imacat at mail dot imacat dot idv dot tw

Well... This must have been some kind of blind spot.  I spent a lot of
time tracking down the configuration setting, but never thought of
changing it.

That seems to solve the problem.  I'm terribly sorry for the bother.

------------------------------------------------------------------------

[2006-12-26 09:48:59] [EMAIL PROTECTED]

http://php.net/pcre

Table 1. PCRE Configuration Options

Name                    Default  Changeable   Changelog
pcre.backtrack_limit    100000   PHP_INI_ALL  Available since PHP 5.2.0.
pcre.recursion_limit    100000   PHP_INI_ALL  Available since PHP 5.2.0.
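
Since both options are PHP_INI_ALL, they can be changed at runtime with
ini_set().  The following is only a sketch of how that might look: the
helper name match_with_limit() and the value 1000000 are illustrative
assumptions, not something taken from this report, and it touches only
pcre.backtrack_limit.  preg_last_error() (available since PHP 5.2.0) is
used to tell a genuine non-match apart from a match aborted by a limit.

```php
<?php
// Hypothetical helper: run preg_match() with a temporarily raised
// pcre.backtrack_limit and report the PCRE error code alongside the result.
function match_with_limit($pattern, $subject, $limit)
{
    // ini_set() returns the previous value as a string, or false on failure.
    $old = ini_set('pcre.backtrack_limit', (string) $limit);

    $result = preg_match($pattern, $subject);
    $error  = preg_last_error();

    // Restore the previous setting so other code is unaffected.
    if ($old !== false) {
        ini_set('pcre.backtrack_limit', $old);
    }

    return array($result, $error);
}

// Same subject and pattern as the failing case in the report.
// 1000000 is an arbitrary illustrative limit, not a recommendation.
$subject = str_repeat("a", 49998);
list($result, $error) = match_with_limit('/^(.*?)\s*$/us', $subject, 1000000);

if ($error === PREG_BACKTRACK_LIMIT_ERROR) {
    echo "Match aborted: pcre.backtrack_limit was reached\n";
} elseif ($error !== PREG_NO_ERROR) {
    echo "Match aborted: PCRE error code $error\n";
} else {
    echo "preg_match() returned " . var_export($result, true) . "\n";
}
```

Checking preg_last_error() matters here because a match that hits the
limit otherwise looks the same as "no match" when printed with %d.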

------------------------------------------------------------------------

[2006-12-26 07:24:08] imacat at mail dot imacat dot idv dot tw

I was wrong. ^^; sorry.  The Expected Result is:

=== Test #01 Non-UTF-8
1. "a" repeated 49997 times
   strlen():  49997, preg_match(): 1
2. "a" repeated 49998 times
   strlen():  49998, preg_match(): 1
=== Test #02 UTF-8
1. "\xE4\xB8\x80" repeated 49997 times
   strlen(): 149991, preg_match(): 1
2. "\xE4\xB8\x80" repeated 49998 times
   strlen(): 149994, preg_match(): 1

And the Actual Result is:

=== Test #01 Non-UTF-8
1. "a" repeated 49997 times
   strlen():  49997, preg_match(): 1
2. "a" repeated 49998 times
   strlen():  49998, preg_match(): 0
=== Test #02 UTF-8
1. "\xE4\xB8\x80" repeated 49997 times
   strlen(): 149991, preg_match(): 1
2. "\xE4\xB8\x80" repeated 49998 times
   strlen(): 149994, preg_match(): 0

------------------------------------------------------------------------

[2006-12-26 07:20:13] imacat at mail dot imacat dot idv dot tw

Description:
------------
    Hi.  This is imacat from Taiwan.  I am seeing PCRE failures on long
matches.  It doesn't seem to get past a limit of about 50000 characters,
UTF-8 or not.  However, looking into the included PCRE library directory,
I saw no such limit anywhere.  In config0.m4 the setting is
-DMATCH_LIMIT=10000000.  In pcrelib/README it states that the default
of --with-match-limit is 500000.  In php.ini I saw
pcre.backtrack_limit=100000.  Every value I saw is far greater than the
50000 limit I am hitting.

    This is hard for me since I have several articles to parse that
are over 50000 bytes/characters in size.

Reproduce code:
---------------
#! /usr/bin/php
<?php
echo "=== Test #01 Non-UTF-8\n";
$a = str_repeat("a", 49997);
echo "1. \"a\" repeated 49997 times\n";
printf("   strlen(): %6d, preg_match(): %d\n",
    strlen($a), preg_match("/^(.*?)\s*$/us", $a));
$a = str_repeat("a", 49998);
echo "2. \"a\" repeated 49998 times\n";
printf("   strlen(): %6d, preg_match(): %d\n",
    strlen($a), preg_match("/^(.*?)\s*$/us", $a));
echo "=== Test #02 UTF-8\n";
$a = str_repeat("\xE4\xB8\x80", 49997);
echo "1. \"\\xE4\\xB8\\x80\" repeated 49997 times\n";
printf("   strlen(): %6d, preg_match(): %d\n",
    strlen($a), preg_match("/^(.*?)\s*$/us", $a));
$a = str_repeat("\xE4\xB8\x80", 49998);
echo "2. \"\\xE4\\xB8\\x80\" repeated 49998 times\n";
printf("   strlen(): %6d, preg_match(): %d\n",
    strlen($a), preg_match("/^(.*?)\s*$/us", $a));
?>


Expected result:
----------------
=== Test #01 Non-UTF-8
1. "a" repeated 49997 times
   strlen():  49997, preg_match(): 1
2. "a" repeated 49998 times
   strlen():  49998, preg_match(): 0
=== Test #02 UTF-8
1. "\xE4\xB8\x80" repeated 49997 times
   strlen(): 149991, preg_match(): 1
2. "\xE4\xB8\x80" repeated 49998 times
   strlen(): 149994, preg_match(): 0


Actual result:
--------------
=== Test #01 Non-UTF-8
1. "a" repeated 49997 times
   strlen():  49997, preg_match(): 1
2. "a" repeated 49998 times
   strlen():  49998, preg_match(): 1
=== Test #02 UTF-8
1. "\xE4\xB8\x80" repeated 49997 times
   strlen(): 149991, preg_match(): 1
2. "\xE4\xB8\x80" repeated 49998 times
   strlen(): 149994, preg_match(): 1



------------------------------------------------------------------------

