I believe you want -H -D gnu.org (that is, --span-hosts --domains=gnu.org). That's what it's for. Wget doesn't know which hostnames under a domain should be allowed and which should not (do you want images.gnu.org? git.gnu.org? lists.gnu.org?), so it turns 'em all off unless you ask for them explicitly.
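To make the rule concrete, here is a rough Python sketch of how I understand the host filter to behave. It is a simplified model, not wget's source: `host_allowed` and its parameters are names made up for illustration, and the real matching may differ in details. On the command line, the equivalent would be something like `wget -r -H -D gnu.org http://gnu.org/` (the start URL is assumed from the gnu.org example in the thread).

```python
# Simplified model of the host filter discussed in this thread.
# NOT wget's actual code; function and parameter names are hypothetical.
def host_allowed(host, parent_host, span_hosts=False, domains=()):
    """Return True if a link to `host`, found on a page from `parent_host`,
    would be followed under this model."""
    host = host.lower()
    parent_host = parent_host.lower()

    # -D / --domains: if a domain list is given, the host must be one of the
    # listed domains or a subdomain of one.
    if domains and not any(
        host == d.lower() or host.endswith("." + d.lower()) for d in domains
    ):
        return False

    # Without -H / --span-hosts, only links on the same host are followed,
    # so gnu.org -> www.gnu.org is rejected by default.
    if not span_hosts and host != parent_host:
        return False

    return True


# Default recursion: the www. host counts as a different host.
print(host_allowed("www.gnu.org", "gnu.org"))
# -H -D gnu.org: hosts under gnu.org are followed, other sites are not.
print(host_allowed("www.gnu.org", "gnu.org", span_hosts=True, domains=["gnu.org"]))
print(host_allowed("example.com", "gnu.org", span_hosts=True, domains=["gnu.org"]))
# -H alone: every host is followed, which is the wandering you wanted to avoid.
print(host_allowed("example.com", "gnu.org", span_hosts=True))
```

Under this model the four calls print False, True, False, True, which matches the behaviour described in the thread: plain --recursive stops at the hostname change, --span-hosts alone goes everywhere, and the two options together stay inside gnu.org.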

HTH,
-mjc

On Thu, May 2, 2013 at 4:52 AM, Darshit Shah <[email protected]> wrote:

> I should have been more clear. --span-hosts will enqueue the other files,
> but it will also enqueue files from other hosts. I wish to recursively
> download a website but not other sites that it links to.
>
> Of course I could add --accept-regex / --reject-regex options to prevent
> wget from wandering onto other hosts. But shouldn't the default --recursive
> option simply handle cases where a www is either added or removed? Or is
> there any scenario that I am missing which would cause undesirable effects
> here?
>
> On Thu, May 2, 2013 at 5:22 PM, Giuseppe Scrivano <[email protected]> wrote:
>
>> Darshit Shah <[email protected]> writes:
>>
>> > When using the --recursive command with wget, there seems to be a small
>> > issue with the logic that decides whether to enqueue a file to the
>> > downloads list or not.
>> >
>> > By default wget downloads files only from the same host. However, this
>> > causes a problem when the target hostname changes thus:
>> > parent: gnu.org
>> > target: www.gnu.org
>> >
>> > This issue causes wget to stop after just one download on a lot of sites.
>> > I'm not sure if this exists in the older or release since I only have the
>> > development version installed.
>>
>> does --span-hosts fix this scenario for you?
>>
>> Cheers,
>> Giuseppe
>>
>
> --
> Thanking You,
> Darshit Shah
> Research Lead, Code Innovation
> Kill Code Phobia.
> B.E.(Hons.) Mechanical Engineering, '14. BITS-Pilani
