On Thu, 28 Aug 2003, C. Church wrote:

> > if ($comment =~ /\a?<(.*?)>/) {
> >
> > }
>
> It says that if the contents of $comment match the regular expression on the
> right, execute the block following.
>
> As for the RE:
>
> /\a?<(.*?)>/
>
> the \a? means 1 or 0 alarms (BEL characters) -- this is superfluous because
> it is not bound, via '^', to the beginning of the string.  (If it were bound,
> a line beginning with two BELs would not match the RE; replacing \a? with
> \a* would fix that -- \a? is "what they said" if not "what they meant".)
>
> <(.*?)>
>
> is the preferred way to capture only the contents of a 'tag' without
> capturing up to the end of the final 'tag' in the $comment string.  That is,
> it is non-greedy.
>
> REs are, by default, greedy -- meaning that each quantifier matches as much
> as it can while still allowing the rest of the RE to match.  Take, for
> example, the following RE:
>
> /<(.*)>/ :
>
> If $comment = '<foo>something</foo>', with the above RE, the value of $1
> would be 'foo>something</foo' -- because .* kept matching all the way to the
> last '>' in the string.
>
> Now, by adding a question mark:
>
> /<(.*?)>/
>
> This tells the RE engine to stop as soon as it finds a match that allows the
> rest of the RE to be true... In this case, $1 would resolve to 'foo'.
>
> It could also be written, although less elegantly, as:
>
> /<([^>]*)>/
>
> This uses a character class meaning "zero or more characters that are NOT a
> close angle bracket", and has the same effect on the above string.
>
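
The three patterns discussed above can be compared side by side. This is only a minimal sketch: the helper name first_capture is made up for illustration, and the sample string is the one from the quoted explanation.

```perl
use strict;
use warnings;

# Hypothetical helper: return the first capture group, or undef on no match.
sub first_capture {
    my ($re, $text) = @_;
    return $text =~ $re ? $1 : undef;
}

my $comment = '<foo>something</foo>';

my $greedy  = first_capture(qr/<(.*)>/,    $comment);  # .* runs to the last '>'
my $lazy    = first_capture(qr/<(.*?)>/,   $comment);  # .*? stops at the first '>'
my $negated = first_capture(qr/<([^>]*)>/, $comment);  # class cannot cross a '>'

print "greedy:  $greedy\n";   # foo>something</foo
print "lazy:    $lazy\n";     # foo
print "negated: $negated\n";  # foo
```

On this input the non-greedy and negated-class forms agree; only the greedy form captures past the first closing bracket.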

I kind of doubt that the original coder wanted to match the 0-length
string between the '<' and the '>' in the string '<>'. A better pattern
all around for the capture would be ([^<>]+).
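
A rough sketch of the difference, assuming the suggested class is meant to sit between the angle brackets, i.e. /<([^<>]+)>/ (the variable names and the second test string are made up for illustration):

```perl
use strict;
use warnings;

my $empty = '<>';

# The non-greedy form accepts an empty "tag"; the '+' form requires one character.
my $star_matches = $empty =~ /<(.*?)>/    ? 1 : 0;  # 1, with $1 set to ''
my $plus_matches = $empty =~ /<([^<>]+)>/ ? 1 : 0;  # 0

# Excluding '<' as well keeps a stray '<' out of the capture.
my $text = 'a < b <i>';
my ($lazy_capture)   = $text =~ /<(.*?)>/;     # ' b <i'
my ($strict_capture) = $text =~ /<([^<>]+)>/;  # 'i'

print "star matches '<>': $star_matches\n";
print "plus matches '<>': $plus_matches\n";
print "lazy:   '$lazy_capture'\n";
print "strict: '$strict_capture'\n";
```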

**** [EMAIL PROTECTED] <Carl Jolley>
**** All opinions are my own and not necessarily those of my employer ****
