Edit report at http://bugs.php.net/bug.php?id=51777&edit=1

 ID:               51777
 User updated by:  trevor at ridgebizdev dot com
 Reported by:      trevor at ridgebizdev dot com
 Summary:          RegEx matching fails
 Status:           Bogus
 Type:             Feature/Change Request
 Package:          PCRE related
 Operating System: Windows XP
 PHP Version:      5.3.2

 New Comment:

preg_last_error() is stellar. Why is it not used properly by the preg_
functions?

Previous Comments:
------------------------------------------------------------------------
[2010-05-17 08:22:26] m...@php.net

http://php.net/preg_last_error

------------------------------------------------------------------------
[2010-05-13 04:19:36] trevor at ridgebizdev dot com

Setting the pcre.backtrack_limit makes 51777 an installation feature
request; however, because no warning is fired, the report remains a bug.

------------------------------------------------------------------------
[2010-05-12 20:10:54] trevor at ridgebizdev dot com

[Pcre]
;PCRE library backtracking limit.
; http://php.net/pcre.backtrack-limit
;pcre.backtrack_limit=100000

;PCRE library recursion limit.
;Please note that if you set this value to a high number you may consume all
;the available process stack and eventually crash PHP (due to reaching the
;stack size limit imposed by the Operating System).
; http://php.net/pcre.recursion-limit
;pcre.recursion_limit=100000

I have nothing set (I must be using the default). I use Windows XP
Professional with 4GB RAM and Core2Duo.

------------------------------------------------------------------------
[2010-05-12 09:14:13] m...@php.net

Please try using this snapshot:

  http://snaps.php.net/php5.3-latest.tar.gz

For Windows:

  http://windows.php.net/snapshots/

Works here. Do you have a pcre.backtrack_limit set?

------------------------------------------------------------------------
[2010-05-09 18:36:54] trevor at ridgebizdev dot com

Description:
------------
When a RegEx "looks" over ~32768 times during a successful match, every
RegEx function fails and returns the empty string.

Test script:
---------------
<?php
$response = http_get("http://www.travelocity.com");

// no problem with these first 2 RegExs
$response2 = preg_replace('/\s+/'," ",$response);
$mytitle = preg_replace('/.*?<\s*title\s*>([^<]*)<.*/i','${1}',$response2);
echo "\nTitle Match Forward: ".$mytitle."\n\n";

// now, here's a problem
$mytitle2 = preg_replace('/.*<\s*title\s*>([^<]*)<.*/i','${1}',$response2);
echo "\nTitle Match Backward: ".$mytitle2."\n\n";
?>

Expected result:
----------------
$mytitle gets extracted properly and echoed because the RegEx never
looks more than 32768 times starting at the beginning of the
travelocity.com page source.

$mytitle2 never gets extracted because the RegEx looks more than 32768
times successfully and preg_replace() crashes into the empty string.

Matching forward for the title is working; matching backward for the
title is failing for large buffers.

Actual result:
--------------
Title Match Forward: Travelocity Travel: Airline Tickets, Hotels,
Flights, Vacations, Cruises & Car Rentals

Title Match Backward:

------------------------------------------------------------------------
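
For anyone landing on this report: the thread's pointer to preg_last_error() can be turned into a small check. The following is only a sketch under stated assumptions, not how the preg_ functions behave internally. The helper name safe_preg_replace() is hypothetical, the sample page is made up so the script does not depend on http_get() or a live site, and the limit is lowered to 1000 purely to make the failure easy to reproduce. The pcre.jit setting exists only on PHP 7 and later; it is switched off here on the assumption that pcre.backtrack_limit is enforced by the non-JIT matcher.

```php
<?php
// Hypothetical helper (not part of PHP): wraps preg_replace() and turns a
// silent PCRE failure (NULL return value) into an exception.
function safe_preg_replace($pattern, $replacement, $subject)
{
    $result = preg_replace($pattern, $replacement, $subject);
    if ($result === null) {
        // preg_last_error() reports why the last preg_ call failed.
        $code = preg_last_error();
        $reason = ($code === PREG_BACKTRACK_LIMIT_ERROR)
            ? 'pcre.backtrack_limit exceeded'
            : 'PCRE error code ' . $code;
        throw new RuntimeException($reason, $code);
    }
    return $result;
}

// Assumptions for a reproducible demo: JIT off (on versions without this
// setting ini_set() simply returns false) and a deliberately low limit.
ini_set('pcre.jit', '0');
ini_set('pcre.backtrack_limit', '1000');

// Made-up stand-in for the whitespace-collapsed page source: the title
// sits at the start, followed by a long tail with no newlines.
$page = '<title>Example</title>' . str_repeat('x', 50000);

// Lazy prefix: finds the title after very little work.
$forward = safe_preg_replace('/.*?<\s*title\s*>([^<]*)<.*/i', '${1}', $page);
echo "Title Match Forward: ", $forward, "\n";

// Greedy prefix: has to back up from the end of the buffer to the title.
$backward = null;
$backwardError = PREG_NO_ERROR;
try {
    $backward = safe_preg_replace('/.*<\s*title\s*>([^<]*)<.*/i', '${1}', $page);
    echo "Title Match Backward: ", $backward, "\n";
} catch (RuntimeException $e) {
    $backwardError = $e->getCode();
    echo "Title Match Backward failed: ", $e->getMessage(), "\n";
}
```

Comparing the return value with === null matters here: echoing NULL prints nothing, which is why the failure in the test script above looks like an empty string.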