Jonas Widarsson wrote:
Andrew Gaffney wrote:

I need to strip out the string ' width="51" height="20"' from about 50 HTML documents. Is there a simple way to do this with a bash/sed or perl one-liner?




Never mind, from googling, I was able to find:

perl -pi -e 's/ width="51" height="20"//' *.html



There is one case this doesn't work for, though. In some of the HTML files, the text I'm looking to strip is split across two lines, like this:


<a href="someurl"><img src="button.gif" border="0" width="51"
        height="20"></a>

How would I strip the text in this case?

I don't know Perl, but I use PHP a lot.
Try writing a PHP script that uses preg_replace().
That should do the job well.
I guess it is just as doable in Perl, but as I said, I don't know Perl.
Regular expressions are cool stuff...

I've been trying to piece together a Perl regex that will work, but I can't seem to get one right. I've tried:


perl -pi -e 's/ width="51"\n        height="20"//' *.html
perl -pi -e 's/ width="51"\n\s+height="20"//' *.html
perl -pi -e 's/ width="51".*\s+height="20"//scg' *.html

and a few other variations. None of them work right.

--
Andrew Gaffney

