ID:               13439
Updated by:       [EMAIL PROTECTED]
Reported By:      [EMAIL PROTECTED]
Status:           Open
-Bug Type:        Regexps related
+Bug Type:        Documentation problem
Operating System: linux 2.2.19
PHP Version:      4.2.1
New Comment:

So it goes...

Previous Comments:
------------------------------------------------------------------------

[2002-06-08 13:13:22] [EMAIL PROTECTED]

AFAIK, POSIX re's have no concept of non-greedy quantifiers. All quantifiers are greedy. But even supposing that this were a possibility, i.e. that the * quantifier is non-greedy, the following needs explaining:

1) Why the difference between + and * ?

2) ereg_replace('.*','b','aa') should produce an _infinite_ string of b's, because the minimum match is the empty string and the global ('g' modified) behavior means that there is an infinity of minimum matches available.

3) ereg_replace('.*c','b','aac') should produce 'ab' but it doesn't, it produces just 'b'. In other words, the addition of the 'c' following the * quantifier has changed the behaviour from non-greedy to greedy!

There is no consistent explanation available for this behaviour. The behaviour is just plain wrong!

------------------------------------------------------------------------

[2002-06-08 08:32:59] [EMAIL PROTECTED]

I think this can be a documentation issue - or a bug. Remember that the ereg(i)_replace functions are always 'g' modified. So the issue becomes whether * is greedy. By nature + is greedy (because otherwise the 'or more' wouldn't make sense), but if it's considered that * isn't greedy, then what PHP does is perfectly correct, as each 'a' matches .*. It would at least be consistent if they were both greedy, as .*? isn't supported and you can't turn off the global behavior.

------------------------------------------------------------------------

[2002-06-08 07:50:27] [EMAIL PROTECTED]

I'm running linux 2.2.19. I've also tested php 4.0.4pl1 running on linux 2.4.2 and the behaviour is the same.

------------------------------------------------------------------------

[2002-06-07 16:48:57] [EMAIL PROTECTED]

Updated the version. Which OS are you running PHP on?

------------------------------------------------------------------------

[2002-06-07 14:48:36] [EMAIL PROTECTED]

Derick! This _is_ a bug! I've now upgraded to 4.2.1 and the behaviour remains the same. Consider just these two:

ereg_replace('.+','b','aa') -> b
ereg_replace('.*','b','aa') -> bb

and this from man 7 regex:

"An atom followed by `*' matches a sequence of 0 or more matches of the atom. An atom followed by `+' matches a sequence of 1 or more matches of the atom."

How can it NOT be a bug?

------------------------------------------------------------------------

The remainder of the comments for this report are too long. To view the rest of the comments, please view the bug report online at
    http://bugs.php.net/13439

-- 
PHP Documentation Mailing List (http://www.php.net/)
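
For readers who want to experiment with the three calls quoted in the comments above ('.+' and '.*' against 'aa', and '.*c' against 'aac'), here is a minimal sketch of a global replace loop. It is written in Python rather than PHP and is only an analogy: Python's re module is a backtracking engine, not the POSIX engine behind ereg_replace, and nothing in this report confirms how ereg_replace is implemented. The function name replace_all and the flag empty_after_match are hypothetical, introduced purely for illustration. The sketch shows one possible reading of the reported output: a fully greedy '.*' first consumes 'aa', and the loop then finds one more empty match at the end of the string.

```python
import re


def replace_all(pattern, replacement, subject, empty_after_match=True):
    """Replace every match of `pattern` in `subject`, scanning left to right.

    Hypothetical helper for illustration only; not PHP's ereg_replace.
    `empty_after_match` controls whether an empty match that starts exactly
    where the previous match ended is also replaced.
    """
    regex = re.compile(pattern)
    out = []
    copied = 0      # subject[:copied] has already been emitted or replaced
    last_end = -1   # end offset of the previous accepted match
    pos = 0         # where the next search starts

    while pos <= len(subject):
        m = regex.search(subject, pos)
        if m is None:
            break
        start, end = m.span()
        if start == end and start == last_end and not empty_after_match:
            # Reject the empty match glued to the previous match; move on.
            pos = start + 1
            continue
        out.append(subject[copied:start])  # text between matches, unchanged
        out.append(replacement)
        copied = end
        last_end = end
        # Step past an empty match so the loop always makes progress.
        pos = end if end > start else end + 1

    out.append(subject[copied:])
    return "".join(out)


if __name__ == "__main__":
    print(replace_all(".+", "b", "aa"))
    print(replace_all(".*", "b", "aa"))
    print(replace_all(".*", "b", "aa", empty_after_match=False))
    print(replace_all(".*c", "b", "aac"))
```

With empty_after_match left at True, the '.*' call yields two replacements while '.+' and '.*c' yield one, because only '.*' can still match the empty string at the end of the subject. Setting the flag to False makes '.*' and '.+' agree. Whether ereg_replace follows either policy is an assumption of this sketch, not something the thread establishes.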