Ed Avis writes:
> Tony Abou-Assaleh wrote:
> >The and operator can be done as follows:
> >
> >grep -E '(re1.*re2)|(re2.*re1)'
>
> That doesn't work for all cases, for example (with -E)
>
> re1=^hello
> re2=^\w+

I'm assuming you meant

    re1=hello
    re2=^\w+

because otherwise, requiring both of those to match is the same as
requiring just ^hello to match.  But it's still possible to write a
regex that matches exactly when both of the revised regexes do.  I
believe it would look something like this:

    ^(hello|\w.*hello)

The closure of regular expressions under conjunction doesn't guarantee
that you get a short or non-repetitive regex out at the end, only that
you can get some regex.

> I wonder if perl5 so-called regular expressions are 'closed under
> and' in this way.  Or if a perl5 regexp can be used to give the
> conjunction of two plain grep regexps, but not necessarily of two
> perl5 regexps.

I'm no mathematician, but I think that traditional BREs are closed
under disjunction if you can do the equivalent of alpha-renaming on
backreferences.  Closure under conjunction in the presence of backrefs
sounds trickier.  Handling full PCRE regexes sounds trickier still;
real Perl regexes are almost certainly impossible, given the existence
of explicitly procedural constructs including execution of arbitrary
Perl code.

> As for --not, perl5 regexps do support negation, I think: the
> pattern ^(?!x)$ matches all lines except those matching ^x$.

Not quite; it matches the empty string.  You can always understand
lookaround as being a zero-width assertion that provides an additional
constraint on what may match.  So to understand

    ^(?!x)$

first take out the lookahead, leaving just

    ^$

which clearly matches only the empty string.  Then the (?!x) just says
"and also fail if there's an x immediately after the start of the empty
string we're matching", which has no interesting effect.

Given lookaround, "all lines except those matching ^x$" is just

    ^(?!x$)

which says "match the beginning of the string, except where that is
immediately followed by an x and the end of the string".

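The regexes above are easy to check mechanically.  What follows is only
a minimal sketch: it uses Python's re module as a stand-in for
Perl-style regexes (an assumption on my part, since the thread is about
grep and perl5), and the sample lines and the names BOTH, EMPTY_ONLY,
NOT_X and matches_both are made up for illustration.

```python
import re

# Hand-written conjunction of "hello" and "^\w+" from the message above.
BOTH = re.compile(r'^(hello|\w.*hello)')

def matches_both(line):
    """Reference check: apply the two regexes separately and AND the results."""
    return bool(re.search(r'hello', line)) and bool(re.search(r'^\w+', line))

# The two lookahead patterns discussed above.
EMPTY_ONLY = re.compile(r'^(?!x)$')  # lookahead adds nothing to ^$
NOT_X = re.compile(r'^(?!x$)')       # everything except the line "x"

# Illustrative sample lines (no trailing newlines).
SAMPLES = ['hello', 'hhello', 'say hello', ' hello', '-hello',
           'help', '', 'x', 'xy', 'y']

# The single regex should agree with the two-regex check on every sample.
for line in SAMPLES:
    assert bool(BOTH.search(line)) == matches_both(line)

print([s for s in SAMPLES if EMPTY_ONLY.search(s)])  # prints ['']
print([s for s in SAMPLES if NOT_X.search(s)])       # every sample except 'x'
```

Agreement on a handful of samples is of course not a proof that the
combined regex is equivalent to the conjunction; it only catches the
obvious mistakes.
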
I think it's always mathematically possible to express "anything except
regex R" using lookaround, but it can certainly be hard to write such
regexes.

> >Making grep do more with less is on my radar, but it is not a
> >priority at the moment.  There are some serious bugs that need to be
> >fixed first.
>
> Understood.

For the record, I'm strongly in favour of an option --all which would
require all -e patterns to match, and being able to negate selected
patterns would also be helpful.  But, yes, I also understand the need
to fix the existing bugs before finding exciting places for new ones to
hide.  :-)

-- 
Aaron Crane
