in them

Micah Cowan Wed, 19 Nov 2008 20:55:29 -0800

Brian wrote:
> I would like to follow all the urls on a site that contain /res/ in the
> path. I've tried using -I and -A, with values such as res, *res*,
> */res/*, etc. Here is an example that downloads pretty much the entire
> site, rather than what I appear (to me) to have specified:
> 
> wget -O- -q http://img.site.org/b/imgboard.html | wget -q -r -l1 -O- -I
> '*res*' -A '*res*' --force-html -B http://img.site.org/b/ -i-
> 
> The urls I would like to follow and output to the command line are of
> the form:
> 
> http://img.site.org/b/res/97867797.html


-A isn't useful here: it's applied only against the "filename" portion
of the URL.

-I is what you want; the trouble is that the * wildcard doesn't match
slashes (there are plans to introduce a ** wildcard, probably in 1.13). So
unfortunately you have to use -I'res,*/res,*/*/res' etc., one pattern per
directory depth, as needed.
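
Since each extra directory level needs its own pattern, the list can be
generated rather than typed by hand. The following is only a rough sketch:
build_include_list is a hypothetical helper name, not part of wget, and the
depth of 3 is an arbitrary assumption about how deep the site nests.

```shell
# Hypothetical helper (not a wget feature): print a comma-separated
# pattern list matching a directory named $1 at depths 0 .. $2-1,
# working around the fact that * does not match slashes.
build_include_list() {
    name=$1
    depth=$2
    prefix=""
    list=""
    i=0
    while [ "$i" -lt "$depth" ]; do
        # Append the next pattern, adding a comma only after the first one.
        list="${list:+$list,}${prefix}${name}"
        # Each further level needs one more "*/" in front.
        prefix="*/${prefix}"
        i=$((i + 1))
    done
    printf '%s\n' "$list"
}

build_include_list res 3
```

The output could then be handed to -I, for instance as
wget -r -l1 -I "$(build_include_list res 3)" http://img.site.org/b/imgboard.html
with the depth raised if the /res/ directories sit further down.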

--
Micah J. Cowan
Programmer, musician, typesetting enthusiast, gamer.
GNU Maintainer: wget, screen, teseq
http://micah.cowan.name/

Re: Only follow paths with /res/ in them
