Andrew Gaffney wrote:
I need to strip out the string ' width="51" height="20"' from about 50 HTML documents. Is there a simple way to do this with a bash/sed or perl one-liner?
Never mind; from googling, I was able to find:
perl -pi -e 's/ width="51" height="20"//' *.html
Although, there is one case this doesn't work for. In some of the HTML files, the text I'm looking to strip is split over 2 lines like:
<a href="someurl"><img src="button.gif" border="0" width="51"
height="20"></a>
How would I strip the text in this case?
I don't know Perl, but I use PHP a lot. Try writing a PHP script that uses preg_replace(); that should do the job. I'd guess it is just as doable in Perl, but as I said, I don't know Perl. Regular expressions are cool stuff...
I've been trying to piece together a Perl regex that will work, but I can't seem to get it to. I've tried:
perl -pi -e 's/ width="51"\n height="20"//' *.html
perl -pi -e 's/ width="51"\n\s+height="20"//' *.html
perl -pi -e 's/ width="51".*\s+height="20"//scg' *.html
and a few other variations. None of them work right.
-- Andrew Gaffney
