ID:               49568
User updated by:  anoop dot john at zyxware dot com
Reported By:      anoop dot john at zyxware dot com
Status:           Bogus
Bug Type:         PCRE related
Operating System: Ubuntu Jaunty
PHP Version:      5.2.10
New Comment:

I am sorry, but adding a ? to the end of the pattern makes the closing
brace an optional match. The regex then matches the content of the first
pair of braces up to the point where it stops matching, and the content
of the second pair completely, including the closing brace. But the
point is to not match the content of the first pair of braces at all.

The following are the results from the suggested change. The matches
array now contains the partial match from the content of the first pair
of braces as matches[0] and the full match from the second pair as
matches[1]. This is incorrect. The contents of the first pair of braces
should not be matched at all.

array (
  0 =>
  array (
    0 => '(Euronext, NASDAQ: CRXL; AMEX,NYSE,NASDAQ,Swiss Exchange: CRX',
    1 => 'Euronext, NASDAQ',
    2 => 'CRXL',
    3 => ' AMEX,NYSE,NASDAQ,Swiss Exchange',
    4 => 'CRX',
  ),
  1 =>
  array (
    0 => '(AMEX,NYSE, Swiss Exchange:CRX;Nasdaq: QTWW)',
    1 => 'AMEX,NYSE, Swiss Exchange',
    2 => 'CRX',
    3 => 'Nasdaq',
    4 => 'QTWW',
  ),
)

array (
  0 =>
  array (
    0 => '(AMEX,NYSE, Swiss Exchange: CRX;Nasdaq:QTWW)',
    1 => 'AMEX,NYSE, Swiss Exchange',
    2 => 'CRX',
    3 => 'Nasdaq',
    4 => 'QTWW',
  ),
)

To give you some context, the regex does the following. It matches two
combinations, both enclosed within one pair of braces and separated from
each other by ;. Each combination is a comma-separated list of exchange
names paired with a stock ticker in full caps, with the ticker separated
from the exchange names by :. The list of exchange names contains one of
the words AMEX|NASDAQ|NasdaqGM|NasdaqGS|NYSE and any number of words or
groups of words separated by spaces. The regex remembers:

1) Combination of exchange names of first stock
2) First stock name
3) Combination of exchange names of second stock
4) Second stock name


Previous Comments:
------------------------------------------------------------------------

[2009-09-18 18:42:41] j...@php.net

Well, it isn't a bug. Your pattern just doesn't work properly. Try
adding '?' at the end of it.
See also: http://php.net/manual/en/regexp.reference.meta.php

------------------------------------------------------------------------

[2009-09-18 18:13:36] anoop dot john at zyxware dot com

Oh no, I don't have a big issue with the bug as far as my application's
needs are concerned. The example was only a use case I tried while
testing the regex. I reported the bug (if it is indeed one) so that you
can fix it (if it is worth fixing) for everybody's sake :-)

------------------------------------------------------------------------

[2009-09-18 17:55:52] j...@php.net

How about fixing your pattern to match 1 or more times? Now it only
matches if there's exactly one match.

------------------------------------------------------------------------

[2009-09-18 14:25:52] anoop dot john at zyxware dot com

I tried taking conditions out of the regular expression, and as soon as
I took out the first condition the expression started giving the
expected result. So the symptom appears only for the specific expression
and the specific text.

My logic about the issue seems to be OK. If

  - pattern \(P\) matches (A) and returns (A) in the matches array, and
  - \(P\) does not match (B), where no part of P can match \( or \),

then \(P\) should definitely match (B)(A) and return (A) in the matches
array.

------------------------------------------------------------------------

[2009-09-18 13:46:51] j...@php.net

Please simplify the regex as much as possible. Once you have the
simplest case that still shows the problem, we might be able to say
whether it's a bug or not.

------------------------------------------------------------------------

The remainder of the comments for this report are too long. To view
the rest of the comments, please view the bug report online at
    http://bugs.php.net/49568