On 15 Oct 2007, at 3:40 PM, Ronald J Kimball wrote:
On Mon, Oct 15, 2007 at 11:58:12AM -0700, Greg V. Raven wrote:
OK, I'm stumped. I'm attempting to come up with a GREP pattern that
will find empty HTML tags. In building up to the full pattern, I've
found that this matches the start tag and the white space after it:
<([a-zA-Z]+) *.*?>\s*
The normal pattern for a closing tag seems to be:
</[a-zA-Z]+>
Given that I've captured the opening tag, it seems to me that the
pattern for the closing tag in my overall pattern should be:
<([a-zA-Z]+) ?.*?>\s*</\1>
However, while this pattern finds some empty tags, if I have nested
tags (empty or full), it finds the entire tag string, which is not
correct.
Any thoughts on what I'm missing?
Even though .*? is non-greedy, it can still match across a tag. I think
you want something like this instead:
<([a-zA-Z]+)[^>]*>\s*</\1>
This had confused me until I thought it through. Maybe the archives
will benefit from how I thought it out (or I can benefit from someone
clarifying it for me).
The following two regular expressions
.*?>
and
[^>]*>
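are equivalent, as far as they go, when both start at the same point
on a single line. The first matches zero or more characters, stopping
at the first place where the rest of the pattern (>) can match, and
then matches >. The second matches zero or more characters that are
not >, and then matches >. (One difference: by default . does not
match a line break, while [^>] does, so only the second can run
across lines.)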
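
To check that claim outside BBEdit, here is a minimal sketch using
Python's re module. It assumes Python handles these constructs the same
way BBEdit's grep does. The sample strings are made up for illustration.

```python
import re

# Hypothetical sample: the tail of a start tag followed by more text.
text = 'class="x" id="y">rest of line>'

# Both patterns are anchored at position 0 by re.match.
lazy = re.match(r'.*?>', text)
negated = re.match(r'[^>]*>', text)

print(lazy.group(0))     # stops at the first >
print(negated.group(0))  # stops at the first >

# Across a line break the two differ: . does not match "\n" by default,
# but the negated class [^>] does.
multiline = 'a\nb>'
multi_lazy = re.match(r'.*?>', multiline)
multi_negated = re.match(r'[^>]*>', multiline)

print(multi_lazy)                 # no match from position 0
print(repr(multi_negated.group(0)))
```

On one line, both stop at the first >. The second half shows the
line-break case, where only the negated class matches.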
The following two regular expressions
.*?></
and
[^>]*></
are not equivalent. The second does what you'd expect, matching zero
or more characters that are not >, then matching >, then <, then /.
The first one matches zero-or-more characters, NOT up to the next >,
but UP TO THE NEXT ></. The whole remainder of the pattern has to
match before it is determined which sequence of zero-or-more
characters will be chosen as the prefix.
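
A small sketch of that difference, again in Python's re module and
again assuming it behaves like BBEdit's grep here. The input string is
a made-up example with one stray > before the ></ sequence.

```python
import re

# Hypothetical input: a ">" that is NOT followed by "</", then one that is.
text = 'a>b></c'

lazy = re.search(r'.*?></', text)
negated = re.search(r'[^>]*></', text)

# The lazy version starts at position 0 and keeps extending past the
# first ">" until the whole suffix "></" can match.
print(lazy.group(0))      # a>b></

# The negated class cannot step over a ">", so the attempt at position 0
# fails and the match is found further along.
print(negated.group(0))   # b></
```

The lazy quantifier is only "lazy" about how far it extends once a
start position is fixed. It will still swallow a > if that is the only
way for the rest of the pattern to succeed.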
So
<([a-zA-Z]+) ?.*?>\s*</\1>
matches from the beginning of a tag, through the sequence >\s*</\1>,
and it doesn't matter how many >s intervene between the tag and that
suffix.
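
To see both full patterns side by side, here is a sketch in Python's re
module (assumed to match BBEdit's grep for these constructs). The HTML
fragment is invented for the test: an outer div holding a non-empty p
and an empty span.

```python
import re

# Hypothetical fragment: only the <span></span> pair is actually empty.
html = '<div><p>text</p><span></span></div>'

original = r'<([a-zA-Z]+) ?.*?>\s*</\1>'   # pattern from the question
suggested = r'<([a-zA-Z]+)[^>]*>\s*</\1>'  # pattern from the reply

# .*? is free to cross any number of ">" characters, so the match runs
# from <div all the way to the closing </div>.
print(re.search(original, html).group(0))

# [^>]* must stop at the first ">", so the start tag and its closing
# tag have to be adjacent (apart from white space).
print(re.search(suggested, html).group(0))
```

The first search returns the whole fragment and the second returns only
<span></span>. Note that the [^>]* version would be thrown off by a
literal > inside an attribute value, which this sketch does not try to
handle.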
— F