Re: negative match pattern, again

Gerald Lai Tue, 13 Jun 2006 08:24:02 -0700

On Tue, 13 Jun 2006, Edward Wong wrote:

> It follows the general form of a negative line search for embedded
> <search>:
>
>  /^\%(.*[<limit0>.*]<search>[.*<limit1>]\)[EMAIL PROTECTED]
>
> For example, to match a line that contains "foo" but does not contain
> "bar" between "big" and "tummy":
>
>  /\%(.*big.*bar.*tummy\)[EMAIL PROTECTED]


Learned a lot more about regexp from this post. Thanks!

Sorry, I missed the ^ anchor:

   /^\%(.*big.*bar.*tummy\)[EMAIL PROTECTED]


Just a question: why is it necessary to have the ^ anchor? Doesn't .*
already include the characters starting from the beginning of the line?

[snip]

No, .* is not tied to the beginning of the line. If nothing is
specified before it in a search expression, it can match starting at
any position.

To give you a better picture, given the line:

123big bar tummy45

when you do a search

  /.*

what it actually does is try to match .* at 1, 2, 3, b, i, etc. But as
soon as it starts with 1, the whole line gets matched because of the *
quantifier (greedy zero or more). This may give you the false impression
that it began from the beginning of the line.

Likewise, doing

  /.

shows that the search expression is tried at every character. The regex
engine searches, and when it finds a match, it advances the pointer to
just past the end of the current match and searches again.
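
The same behaviour can be reproduced outside Vim. Here is a minimal
sketch using Python's re module as a stand-in for Vim's engine (an
assumption on my part: the two engines differ in syntax and details, but
both try an unanchored pattern at successive offsets). The variable
names are only for illustration.

```python
import re

line = "123big bar tummy45"

# An unanchored ".*" can be attempted at every offset, not just offset 0.
dot_star = re.compile(r".*")
starts = [pos for pos in range(len(line)) if dot_star.match(line, pos)]

# The first attempt (offset 0) already succeeds and, being greedy,
# takes the whole line, which is why it looks anchored.
first = re.search(r".*", line)

# A single "." matches once per character; each new search begins
# where the previous match ended.
dots = [m.start() for m in re.finditer(r".", line)]

print(starts == list(range(len(line))))  # every offset is a valid start
print(first.span())
print(dots == list(range(len(line))))
```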

Generally, it's a good idea to place something unique before the
\@! expression's atom, because there are a lot of places where an
expression does _not_ match. And since we're dealing with line matching
here, the best unique identifier to use is the start of line anchor ^.
It's a one-character regex, a zero-width match, unique (for the current
line), intuitive (for line matching), and always exists.

So if you perform

  /\%(.*big.*bar.*tummy\)\@!.*

(without the ^ anchor)

on the line mentioned above, the regex search would look for every
occurrence where .*big.*bar.*tummy does _not_ match and try to match the
trailing .* at those locations (note the plurality).

The first position where .*big.*bar.*tummy does not match (so the \@!
atom succeeds) is at "i". Then .* would match:

               end of match
               V
  ig bar tummy45
  ^
  beginning of match
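
To check this walk-through, here is a rough equivalent in Python's re
module, where the negative lookahead (?!...) plays the role that
\%(...\)\@! plays in Vim. This is only an analogy under that assumption,
not Vim's own engine, and it assumes the pattern ends in a trailing .*
as described above.

```python
import re

line = "123big bar tummy45"

# Without the anchor: the engine tries offsets 0, 1, 2, 3 (where
# ".*big.*bar.*tummy" still matches, so the negative lookahead fails)
# and succeeds at offset 4, the "i", where "big" can no longer be found.
unanchored = re.search(r"(?!.*big.*bar.*tummy).*", line)

# With the anchor: the only place "^" can match is offset 0, where the
# lookahead fails, so the line as a whole is rejected.
anchored = re.search(r"^(?!.*big.*bar.*tummy).*", line)

print(unanchored.span(), repr(unanchored.group()))
print(anchored)
```

The unanchored search reports a match that starts in the middle of the
line, while the anchored one reports no match at all, which is the
result you want from a negative line search.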

See

  :help /\@!
  :help /^
  :help search-pattern

HTH :)
--
Gerald
