I want to download a few sites, and have some questions about the best way to do it...
I'll be doing a recursive download to infinite depth, but limited to the current directory downwards (-np, no parent), and I'll also be downloading the page requisites (-p):

    wget -r -l inf -np -p http://domain.name/index.html

(I'll also be adding --limit-rate and a bit of pause between each download.)

My problem is that I want all page requisites, even if they span hosts or are located above/parallel to the site (directory) I'll be working in (it's on GeoCities, so there are many "parallel" sites). As long as something is part of a page in the directory I'm working in it should be downloaded, but nothing else.

An additional problem is that there may be lists of links that actually point to those directories/hosts; nothing from them should be downloaded unless it's part of a page.

Would this be possible (at least partially; I understand if it's a problem getting around the no-parent)?

--
Do not do today, what you can get others to do tomorrow.
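This is roughly what I was thinking of trying (just a sketch; domain.name and the rate/pause values are placeholders, and I'm not sure -H combined with -p and -np actually behaves the way I hope):

    # recursive mirror of the directory, plus page requisites,
    # allowing requisites to come from other hosts (-H / --span-hosts)
    wget -r -l inf -np -p -H \
         --wait=2 --limit-rate=20k \
         http://domain.name/index.html

From what I've read, -p is meant to fetch whatever a page needs to display properly, but I don't know whether that reaches across hosts without -H, or whether adding -H makes the recursion itself spill over onto those other hosts and parallel directories, which is exactly what I want to avoid.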
