Hamish wrote:

> upon reflection, the greediness of regex would make the original
> '[^<]*' match until the last < on the line, not the one next found.
> (??????: to stop at the next found you would use '[^<]*?')

No; [^<] won't match a < regardless of whether the repetition is
greedy (*) or non-greedy (*?).
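
This is easy to check. The following is only a sketch using Python's re
module (my choice for illustration, not the tool under discussion), and
the sample line is made up:

```python
import re

# Hypothetical sample line with more than one '<' in it.
line = "abc<def<ghi"

# Greedy repetition on its own: stops before the first '<',
# because [^<] cannot match '<' at all.
greedy_alone = re.match(r"[^<]*", line).group(0)

# Greedy and non-greedy repetition, each followed by a literal '<'.
greedy_then_lt = re.match(r"([^<]*)<", line).group(1)
lazy_then_lt = re.match(r"([^<]*?)<", line).group(1)

print(greedy_alone, greedy_then_lt, lazy_then_lt)
```

All three captures are "abc": the match ends at the first <, not the
last one, whichever form of repetition is used.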

Non-greedy repetitions are only needed when the base pattern can also
match whatever follows the repetition. In that case, greedy
repetitions prefer to continue (matching the character(s) as part of
the repetition), while non-greedy repetitions prefer to terminate
(matching the character(s) against the following expression).

E.g. for the string aaabbb, (.*)(b+) will match with \1=aaabb, \2=b,
while (.*?)(b+) will match with \1=aaa, \2=bbb.
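
The same example as a small sketch, again assuming Python's re module
(which supports non-greedy repetition) rather than any particular sed or
libc implementation:

```python
import re

s = "aaabbb"

# Greedy: .* takes as much as it can and gives back only the single
# 'b' that b+ needs in order to match.
greedy = re.match(r"(.*)(b+)", s).groups()

# Non-greedy: .*? stops as soon as b+ can match, so b+ gets every 'b'.
lazy = re.match(r"(.*?)(b+)", s).groups()

print(greedy)  # ('aaabb', 'b')
print(lazy)    # ('aaa', 'bbb')
```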

Also, note that non-greedy repetitions aren't portable. They exist in
PCRE, and some other regex implementations (e.g. [X]Emacs) have them. 
I don't think that they're supported by the GNU libc functions
(regcomp, regexec), and they certainly aren't specified in POSIX.

For sed, POSIX doesn't even specify \? and \+; those are GNU
extensions.

-- 
Glynn Clements <[EMAIL PROTECTED]>
_______________________________________________
grass-dev mailing list
[email protected]
http://lists.osgeo.org/mailman/listinfo/grass-dev
