ID:               46947
Updated by:       [email protected]
Reported By:      victor at casnt dot ro
Status:           Bogus
Bug Type:         PCRE related
Operating System: Debian
PHP Version:      5.2.8
New Comment:

If you write bad regular expressions, you can consume all the stack
space and crash the process. That is why there are recursion and
backtrack limits.

http://www.php.net/manual/en/pcre.configuration.php

Previous Comments:
------------------------------------------------------------------------

[2008-12-27 16:08:48] victor at casnt dot ro

I have read the docs and I see that preg_match does not work with big
strings. In that case I would like to convert this ticket from a bug
report into a feature request. It would also help against the
competition (Perl works with big strings).

I need to parse a big XML file. Using the PHP XML functions is a
complication in my case. Pattern matching can solve my problem in just
3 lines.

When I say "big string" I mean 100KB (my test failed on a 104KB file).
Pattern matching will clearly be inefficient on huge strings compared to
a dedicated parser.

Thank you.

------------------------------------------------------------------------

[2008-12-26 20:02:12] [email protected]

Thank you for taking the time to write to us, but this is not
a bug. Please double-check the documentation available at
http://www.php.net/manual/ and the instructions on how to report
a bug at http://bugs.php.net/how-to-report.php

------------------------------------------------------------------------

[2008-12-26 19:54:10] [email protected]

Hello,
the dot character only matches a newline in a string when the 's'
modifier is used.

Try:
/(<\?xml.+<report[^>]+>)/s

PS: You don't need to escape < and > in this case, because they are not
the delimiter (the / here).

------------------------------------------------------------------------

[2008-12-26 19:16:53] victor at casnt dot ro

Description:
------------
The exact same pattern fails to match the same string once an extra
line is added at the end. The length of the string seems to be the
problem.

Reproduce code:
---------------
The script:

$handle = fopen('test3.xml', "r");
$contents = fread($handle, filesize('test3.xml'));
fclose($handle);

$contents = preg_replace('/\>[\t|\ |\s]{1,}\</', "><", $contents);
$contents = preg_replace('/\n/', "", $contents);

if(preg_match('/(\<\?xml.{1,}\<report[^>]{1,}\>)/', $contents, $match))
{
    print "Match: $match[1]\n";
}
else {print "Fail\n";}

OBSERVATION: If you delete the last line from the file test3.xml, the
pattern matching will work fine. The last lines have nothing to do with
the pattern.

You can find the file here:
http://www.casnt.ro/Files/XML/test3.xml

Expected result:
----------------
Match: <?xml version="1.0" encoding="utf-8"?><report AppKey="CNAS-v1.0.2801.665" AppID="14" clinic="SSSSSSSSSSSSSSSSSSSSSS" fiscalCode="11111111" contractNo="123123123/2008" insuranceHouse="CAS-NT" reportingDate="2008-12-07" startFrom="2008-11-01" endTo="2008-11-30" invoiceNo="" labValue="0" hspValue="0" xmlns="http://localhost">

Actual result:
--------------
Fail

------------------------------------------------------------------------
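
A note on the limits mentioned in the newest comment. A greedy quantifier such as `.+` or `.{1,}` first runs to the end of the subject and then backs up one character at a time until the rest of the pattern fits. On a subject of about 104KB that can plausibly exceed pcre.backtrack_limit (100000 by default in PHP 5.2), which would be consistent with the match succeeding once a line is removed. The sketch below is not from the thread and only illustrates that reading. It assumes a synthetic string in place of test3.xml, a hypothetical helper `first_group()`, and that `<report` sits near the start of the data so that a lazy `.+?` finds it after a few steps.

```php
<?php
// Sketch only: synthetic data, not the reporter's test3.xml.

// Turn the JIT off where the setting exists, so the interpreter's backtrack
// counter is the one in play (assumption: ini_set() simply returns false on
// PHP versions that do not know pcre.jit).
ini_set('pcre.jit', '0');

// A short header followed by roughly 200 KB of filler elements.
$contents = '<?xml version="1.0" encoding="utf-8"?>'
          . '<report AppID="14" xmlns="http://localhost">'
          . str_repeat('<row/>', 35000)
          . '</report>';

// Hypothetical helper: returns the first capture group, null when there is
// no match, or false when PCRE gave up with an error.
function first_group($pattern, $subject)
{
    $result = preg_match($pattern, $subject, $match);
    if ($result === false) {
        return false;
    }
    return $result === 1 ? $match[1] : null;
}

// 1. Greedy quantifier under the PHP 5.2 default limit.
ini_set('pcre.backtrack_limit', '100000');
$greedy      = first_group('/(<\?xml.+<report[^>]+>)/s', $contents);
$greedyError = preg_last_error();

// 2. Lazy quantifier under the same limit: it grows forward from the
//    declaration instead of backing up from the end of the subject.
$lazy      = first_group('/(<\?xml.+?<report[^>]+>)/s', $contents);
$lazyError = preg_last_error();

// 3. Greedy quantifier again, with the limit raised at runtime.
ini_set('pcre.backtrack_limit', '1000000');
$raised      = first_group('/(<\?xml.+<report[^>]+>)/s', $contents);
$raisedError = preg_last_error();

if ($greedy === false && $greedyError === PREG_BACKTRACK_LIMIT_ERROR) {
    echo "Greedy pattern: backtrack limit exhausted\n";
}
if (is_string($lazy)) {
    echo "Lazy pattern: $lazy\n";
}
if (is_string($raised)) {
    echo "Greedy pattern, raised limit: $raised\n";
}
```

Checking preg_last_error() after a false return separates "no match" from "PCRE gave up", which the original script's else branch cannot do. Raising the limit only moves the threshold, so for data of unknown size the lazy form (or a real XML parser) is likely the safer choice.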
