On Wednesday, August 15, 2012 10:39:21 AM UTC-5, jcordes wrote:
> I am working on an HTML file generated by WordPerfect's Publish to HTML,
> trying to get clean html but maintaining the format fairly closely. One thing
> I'd like to do is delete empty tag pairs, such as this:
> <SPAN STYLE="text-decoration: underline"></SPAN>
>
> I'm sure this must be a trivial regex problem for some but I'm apparently
> missing some key idea. I'm working under Linux, with
> VIM - Vi IMproved 7.1 (2007 May 12, compiled Oct 17 2008 18:11:28)
> (sorry, not easy to upgrade)
> I have tried search patterns along the following lines
> /<SPAN .\{-}><\/SPAN>
The problem is that "non-greedy" only means "as few characters as possible
while still letting the whole pattern match"; it does not stop the match from
running past a '>'. E.g. your pattern will match all of this:
<SPAN STYLE="x">abcdef</SPAN><SPAN></SPAN>
because the .\{-} matches any character at all, and therefore matches very
happily on the 'STYLE="x">abcdef</SPAN><SPAN' text.
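A quick way to check this for yourself, as a rough sketch: matchstr() returns
the text that a pattern matches within a string. The sample string is made up
for illustration.

```vim
" Made-up sample: a non-empty SPAN followed by an empty one.
let s = '<SPAN STYLE="x">abcdef</SPAN><SPAN></SPAN>'
" The non-greedy pattern from the question. The '/' needs no backslash
" here because the pattern is a string, not a / search.
echo matchstr(s, '<SPAN .\{-}></SPAN>')
```

The echoed text should be the whole sample string, not just the empty pair at
the end.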
Try this instead, explicitly asking for any character which is NOT a '>':
<SPAN[^>]*><\/SPAN>
Or this variant, also allowing line breaks:
<SPAN\_[^>]*><\/SPAN>
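To actually delete the pairs rather than just find them, the same pattern can
go into a substitute. A minimal sketch, assuming the tags are upper-case as in
your sample and that you want every empty pair in the buffer removed. The
second command is my own generalization to any tag name, so treat it as
untested against your file:

```vim
" Remove every empty SPAN pair in the whole buffer.
" g = all matches on a line, e = no error if nothing matches.
:%s/<SPAN\_[^>]*><\/SPAN>//ge

" Hypothetical generalization: any empty pair whose closing tag name
" repeats the opening one (\1 refers back to the captured name).
:%s/<\(\w\+\)\_[^>]*><\/\1>//ge
```

Removing a pair can leave its parent element empty, so you may need to repeat
the command until it no longer changes anything.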
--
You received this message from the "vim_use" maillist.
Do not top-post! Type your reply below the text you are replying to.
For more information, visit http://www.vim.org/maillist.php