Re: [PHP] REGEX: grouping of alternative patterns
On 30/10/2007, Stijn Verholen [EMAIL PROTECTED] wrote:
> Hey list,
>
> I'm having problems with grouped alternative patterns.
> The regex I would like to use is the following:
>
> /\s*(`?.+`?)\s*int\s*(\(([0-9]+)\))?\s*(unsigned)?\s*(((auto_increment)?\s*(primary\s*key)?)|((not\s*null)?\s*(default\s*(`.*`|[0-9]*)?)?))\s*/i
>
> It matches this statement:
>
> `id` INT(11) UNSIGNED AUTO_INCREMENT PRIMARY KEY
>
> But not this:
>
> `test4` INT(11) UNSIGNED NOT NULL DEFAULT 5
>
> However, if I switch the alternatives, the first statement doesn't match, but the second does.
> FYI: In both cases, the column name and data type are matched, as expected.
> It appears to be doing lazy evaluation on the pattern, even though every resource I can find states that every alternative is tried in turn until a match is found.

It's not lazy. Given alternative subpatterns, the PCRE engine chooses the leftmost alternative that matches, not the longest. For instance:

    <?php
    preg_match('/a|ab/', 'abbot', $matches);
    print_r($matches);
    ?>

    Array
    (
        [0] => a
    )

This isn't what you'd expect if you were familiar with POSIX regular expressions, but it matches Perl's behaviour.

Because each of your subpatterns can match an empty string, the lefthand subpattern always matches and the righthand subpattern might as well not be there.

The simplest solution, if you don't want to completely rethink your regexp, might be to replace \s with [[:space:]], remove the delimiters and the i modifier, and just use eregi(), like so:

    $pattern = '[[:space:]]*(`?.+`?)[[:space:]]*int[[:space:]]*(\(([0-9]+)\))?[[:space:]]*(unsigned)?[[:space:]]*(((auto_increment)?[[:space:]]*(primary[[:space:]]*key)?)|((not[[:space:]]*null)?[[:space:]]*(default[[:space:]]*(`.*`|[0-9]*)?)?))[[:space:]]*';

    eregi($pattern, $column1, $matches);
    print_r($matches); // match

    eregi($pattern, $column2, $matches);
    print_r($matches); // match

-robin

--
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php
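The leftmost-alternative behaviour described above can be reproduced in isolation. What follows is only a minimal sketch, assuming PHP 7 or later with the PCRE extension. The helper matched_text() and the two reduced patterns are hypothetical: they are cut down from the original regex to just the two attribute alternatives and are not taken from the thread.

```php
<?php
// Hypothetical helper: the text matched by $pattern, or null when nothing matches.
function matched_text(string $pattern, string $subject): ?string
{
    return preg_match($pattern, $subject, $m) === 1 ? $m[0] : null;
}

// Hypothetical reduced patterns: the two attribute alternatives in either order.
$autoFirst = '/^(?:(auto_increment)?\s*(primary\s*key)?|(not\s*null)?\s*(default\s*[0-9]*)?)/i';
$nullFirst = '/^(?:(not\s*null)?\s*(default\s*[0-9]*)?|(auto_increment)?\s*(primary\s*key)?)/i';

$auto = 'AUTO_INCREMENT PRIMARY KEY';
$null = 'NOT NULL DEFAULT 5';

// The left-hand alternative always succeeds, if only on the empty string,
// so the engine never needs to try the right-hand one.
$r1 = matched_text($autoFirst, $auto); // 'AUTO_INCREMENT PRIMARY KEY'
$r2 = matched_text($autoFirst, $null); // '' (empty match)
$r3 = matched_text($nullFirst, $auto); // '' (empty match)
$r4 = matched_text($nullFirst, $null); // 'NOT NULL DEFAULT 5'

var_dump($r1, $r2, $r3, $r4);
```

Swapping the alternatives only changes which statement gets the empty match, which is consistent with what the original post reports.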
RE: [PHP] REGEX: grouping of alternative patterns
On 30 October 2007 11:07, Stijn Verholen wrote:
> Hey list,
>
> I'm having problems with grouped alternative patterns.
> The regex I would like to use is the following:
>
> /\s*(`?.+`?)\s*int\s*(\(([0-9]+)\))?\s*(unsigned)?\s*(((auto_increment)?\s*(primary\s*key)?)|((not\s*null)?\s*(default\s*(`.*`|[0-9]*)?)?))\s*/i

Since all the parts beyond the id and datatype are optional, I don't see how this can ever not match. Please define more accurately what you mean by "doesn't match".

Cheers!

Mike

-
Mike Ford, Electronic Information Services Adviser,
JG125, The Headingley Library, James Graham Building,
Leeds Metropolitan University, Headingley Campus, LEEDS, LS6 3QS, United Kingdom
Email: [EMAIL PROTECTED]
Tel: +44 113 812 4730 Fax: +44 113 812 3211
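Mike's observation can be checked directly. This is a small sketch, assuming PHP 7 or later with the PCRE extension, that runs the pattern from the original post against both statements. The variable names are illustrative only.

```php
<?php
// The pattern from the original post, unchanged.
$pattern = '/\s*(`?.+`?)\s*int\s*(\(([0-9]+)\))?\s*(unsigned)?\s*(((auto_increment)?\s*(primary\s*key)?)|((not\s*null)?\s*(default\s*(`.*`|[0-9]*)?)?))\s*/i';

$column1 = '`id` INT(11) UNSIGNED AUTO_INCREMENT PRIMARY KEY';
$column2 = '`test4` INT(11) UNSIGNED NOT NULL DEFAULT 5';

// preg_match() reports a match for both statements.
$found1 = preg_match($pattern, $column1, $m1);
$found2 = preg_match($pattern, $column2, $m2);

// For $column1 the overall match covers the whole statement.
// For $column2 it stops before NOT NULL: the first alternative
// matched the empty string, so the second was never tried.
echo $m1[0], "\n";
echo $m2[0], "\n";
```

So the pattern matches in both cases; what differs is how much of the statement the match, and therefore the captured groups, covers.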
Re: [PHP] REGEX: grouping of alternative patterns [SOLVED]
Robin Vickery wrote:
> [snip]
> Because each of your subpatterns can match an empty string, the lefthand subpattern always matches and the righthand subpattern might as well not be there.

Indeed they do, I did not realise that.

> The simplest solution, if you don't want to completely rethink your regexp might be to replace \s with [[:space:]], remove the delimiters and the i modifier and just use eregi(). like so:

Because this is for a proof-of-concept application that will only be used by me for the time being, and because I always give a default value when specifying 'not null', I'm going to use the following, and think about a more general solution if and when the need arises.

This matches both my cases:

/\s*(`?.+`?)\s*int\s*(\(([0-9]+)\))?\s*(unsigned)?\s*(((not\s*null)\s*(default\s*(`.*`|[0-9]*)?))|((auto_increment)?\s*(primary\s*key)?))\s*/i

> [snip]
> -robin

Thanks for your insights, Robin!

Greetz,

Stijn
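As a quick check of the revised pattern, here is a minimal sketch, assuming PHP 7 or later with the PCRE extension. It runs the pattern against the two statements from the original post. The variable names are illustrative, and the group numbers in the comments are counted from the pattern as written.

```php
<?php
// The revised pattern: "not null" and "default" are now mandatory in the
// first alternative, so it can no longer succeed on an empty string.
$pattern = '/\s*(`?.+`?)\s*int\s*(\(([0-9]+)\))?\s*(unsigned)?\s*(((not\s*null)\s*(default\s*(`.*`|[0-9]*)?))|((auto_increment)?\s*(primary\s*key)?))\s*/i';

$column1 = '`id` INT(11) UNSIGNED AUTO_INCREMENT PRIMARY KEY';
$column2 = '`test4` INT(11) UNSIGNED NOT NULL DEFAULT 5';

preg_match($pattern, $column1, $m1);
preg_match($pattern, $column2, $m2);

// Both overall matches now cover the whole statement.
echo $m1[0], "\n";
echo $m2[0], "\n";

// Group 11 holds AUTO_INCREMENT for the first statement,
// group 9 holds the default value for the second.
echo $m1[11], "\n";
echo $m2[9], "\n";
```

This works because the first alternative fails outright on the AUTO_INCREMENT statement, which forces the engine to fall through to the second one. The second alternative can still match an empty string, which is why the approach relies on NOT NULL always being followed by DEFAULT.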