I should have been clearer. --span-hosts will enqueue the other files,
but it will also enqueue files from other hosts. I want to recursively
download one website, but not the other sites that it links to.

Of course I could add --accept-regex / --reject-regex options to prevent
wget from wandering onto other hosts. But shouldn't the default --recursive
option simply handle cases where a www is either added or removed? Or is
there any scenario that I am missing which would cause undesirable effects
here?
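
To make the question concrete, the check being asked for could be sketched
roughly as below. This is only an illustration in Python, not wget's actual
host-comparison code (wget itself is written in C); the function name
same_site and the helper strip_www are hypothetical.

```python
def same_site(parent_host: str, target_host: str) -> bool:
    """Return True if the two hostnames differ at most by a leading 'www.'."""

    def strip_www(host: str) -> str:
        # Hostnames are case-insensitive; a trailing dot only marks the DNS root.
        host = host.lower().rstrip(".")
        return host[4:] if host.startswith("www.") else host

    return strip_www(parent_host) == strip_www(target_host)


if __name__ == "__main__":
    print(same_site("gnu.org", "www.gnu.org"))   # www added
    print(same_site("gnu.org", "ftp.gnu.org"))   # a genuinely different host
```

Under this sketch only the "www." prefix is treated as equivalent, so other
subdomains and unrelated hosts would still be skipped unless --span-hosts is
given.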

On Thu, May 2, 2013 at 5:22 PM, Giuseppe Scrivano <[email protected]> wrote:

> Darshit Shah <[email protected]> writes:
>
> > When using the --recursive option with wget, there seems to be a small
> > issue with the logic that decides whether to enqueue a file to the
> > downloads list or not.
> >
> > By default wget downloads files only from the same host. However, this
> > causes a problem when the target hostname changes thus:
> > parent: gnu.org
> > target: www.gnu.org
> >
> > This issue causes wget to stop after just one download on a lot of sites.
> > I'm not sure whether this exists in older releases, since I only have the
> > development version installed.
>
> does --span-hosts fix this scenario for you?
>
> Cheers,
> Giuseppe
>



-- 
Thanking You,
Darshit Shah
Research Lead, Code Innovation
Kill Code Phobia.
B.E.(Hons.) Mechanical Engineering, '14. BITS-Pilani
