On Tue, Dec 30, 2008 at 12:51:31PM -0800, Gary Kline wrote:
> On Tue, Dec 30, 2008 at 09:16:23PM +0100, Roland Smith wrote:
> > On Tue, Dec 30, 2008 at 11:31:14AM -0800, Gary Kline wrote:
> > > The problem is that there are many, _many_ embedded
> > > "<A HREF="http://whatever> Site</A> in my hundreds, or
> > > thousands, or files. I only want to delete the
> > > "http://<junkfoo.com>" lines, _not_ the other Href links.
> > >
> > > Which would be best to use, given that a backup is critical?
> > > sed or perl?
> >
> > IMHO, perl with the -i option to do in-place editing with backups. You
> > could also use the -p option to loop over files. See perlrun(1).
> >
> > Roland
>
> All right, then is this the right syntax. In other words, do
> I need the double quotes to match the "http:" string?
>
>   perl -pi.bak -e 'print unless "/m/http:/" || eof; close ARGV if eof' *

You don't need the quotes (if the command doesn't contain anything that
your shell would eat/misuse/replace). See perlop(1).

This will disregard the entire line with a URI in it. Is this really
what you want?

Copy some of the files you want to scrub to a separate directory, and
run tests to see if your script works:

  mkdir mytest; cp foo mytest/; cd mytest;
  perl -pi.bak ../scrub.pl foo
  diff -u foo foo.bak

Roland
--
R.F.Smith                                   http://www.xs4all.nl/~rsmith/
[plain text _non-HTML_ PGP/GnuPG encrypted/signed email much appreciated]
pgp: 1A2B 477F 9970 BA3C 2914 B7CE 1277 EFB0 C321 A725 (KeyID: C321A725)
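
A minimal sketch of such a test run, under a few assumptions: the
unwanted links all contain http://junkfoo.com, each sits on a line of
its own, and foo.html is a made-up sample file. It uses -n instead of
-p, because -p already prints every line and an explicit print would
emit kept lines twice. Treat it as a starting point, not a finished
scrub script.

```shell
# Work in a scratch directory so the real files are never touched.
mkdir -p mytest && cd mytest || exit 1

# Hypothetical sample input: one junk link, one link to keep.
cat > foo.html <<'EOF'
<p>intro</p>
<A HREF="http://junkfoo.com/page">Junk</A>
<A HREF="http://example.org/">Keep</A>
EOF

# -n loops over the input without printing automatically.
# -i.bak edits in place and keeps the original as foo.html.bak.
# Lines that match the junk host are skipped; all others are written back.
perl -ni.bak -e 'print unless m{http://junkfoo\.com}' foo.html

# Review the change against the backup (diff exits 1 when files differ).
diff -u foo.html.bak foo.html || true
```

If the diff shows only the junk lines removed, the same perl command
can be pointed at the real files; the .bak copies remain as the backup.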
