I should have been clearer. --span-hosts does enqueue the files on www.gnu.org, but it also enqueues files from every other host the pages link to. I want to recursively download one website, not the other sites it links to.
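As a rough illustration of the behaviour being asked for (not wget's actual implementation), the host check could treat two hostnames as the same site when they differ only by a leading "www.". The function names `strip_www` and `same_site` below are hypothetical and exist only for this sketch:

```python
from urllib.parse import urlsplit


def strip_www(host: str) -> str:
    """Lower-case a hostname and drop a single leading 'www.' label."""
    host = host.lower().rstrip(".")
    return host[4:] if host.startswith("www.") else host


def same_site(parent_url: str, target_url: str) -> bool:
    """Return True when both URLs point at the same host, ignoring 'www.'."""
    parent = urlsplit(parent_url).hostname or ""
    target = urlsplit(target_url).hostname or ""
    return strip_www(parent) == strip_www(target)


if __name__ == "__main__":
    # gnu.org -> www.gnu.org would be followed; an unrelated host would not.
    print(same_site("http://gnu.org/", "http://www.gnu.org/software/wget/"))
    print(same_site("http://gnu.org/", "http://example.com/"))
```

Under this rule other subdomains such as ftp.gnu.org still count as different hosts, so only the "www added or removed" case changes.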
Of course I could add --accept-regex / --reject-regex options to prevent wget from wandering onto other hosts. But shouldn't the default --recursive option simply handle cases where a www is either added or removed? Or is there any scenario that I am missing which would cause undesirable effects here?

On Thu, May 2, 2013 at 5:22 PM, Giuseppe Scrivano <[email protected]> wrote:

> Darshit Shah <[email protected]> writes:
>
> > When using the --recursive command with wget, there seems to be a small
> > issue with the logic that decides whether to enqueue a file to the
> > downloads list or not.
> >
> > By default wget downloads files only from the same host. However, this
> > causes a problem when the target hostname changes thus:
> > parent: gnu.org
> > target: www.gnu.org
> >
> > This issue causes wget to stop after just one download on a lot of sites.
> > I'm not sure if this exists in the older or release since I only have the
> > development version installed.
>
> does --span-hosts fix this scenario for you?
>
> Cheers,
> Giuseppe
>

-- 
Thanking You,
Darshit Shah
Research Lead, Code Innovation
Kill Code Phobia.
B.E.(Hons.) Mechanical Engineering, '14.
BITS-Pilani
