I want to download a few sites, and have some questions about the best way to do it...
I'll be doing a recursive download to infinite depth, but limited to the current directory downwards (-np, no parent), and I'll also be downloading the page requisites (-p):

    wget -r -l inf -np -p http://domain.name/index.html

(I'll also be adding --limit-rate and a bit of pause between each download.)

My problem is that I want all page requisites, even if they span hosts or are located above/parallel to the site (directory) I'll be working in (it's on GeoCities, so there are many "parallel" sites). As long as something is part of a page in the directory I'm working in it should be downloaded, but nothing else.

An additional problem is that there may be lists of links that actually point to those directories/hosts; nothing from them should be downloaded unless it's part of a page.

Would this be possible (at least partially; I understand if it's a problem getting around the no-parent)?

--
Do not do today, what you can get others to do tomorrow.
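This is roughly what I was thinking of trying (just a sketch; domain.name and the rate/pause values are placeholders, and I'm not sure -H combined with -p and -np actually behaves the way I hope):

    # recursive mirror of the directory, plus page requisites,
    # allowing requisites to come from other hosts (-H / --span-hosts)
    wget -r -l inf -np -p -H \
         --wait=2 --limit-rate=20k \
         http://domain.name/index.html

From what I've read, -p is meant to fetch whatever a page needs to display properly, but I don't know whether that reaches across hosts without -H, or whether adding -H makes the recursion itself spill over onto those other hosts and parallel directories, which is exactly what I want to avoid.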
