Hi Dale,

If you have a look at the --page-requisites entry in 'man wget', this is explained quite well. To me it looks like you are missing --level 2.
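As a rough, untested sketch of what I mean: the snippet below takes your first command and adds an explicit depth limit. Since 'man wget' documents --mirror as equivalent to -r -N -l inf --no-remove-listing, I spell out --recursive and --timestamping here and set --level=2 in place of the infinite depth. That substitution is my assumption about what you need, and I have not run it against iana.org. The snippet only prints the command so you can review it first.

```shell
# Untested sketch: Dale's first command with an explicit recursion depth.
# --mirror implies "-l inf", so it is replaced by its documented parts
# with --level=2 instead (an assumption, not a verified fix).
url='http://www.iana.org/assignments/index.html'
set -- --recursive --timestamping --level=2 \
       --convert-links --no-parent --page-requisites "$url"

# Print the command for review; run `wget "$@"` to actually fetch.
printf 'wget'
printf ' %s' "$@"
printf '\n'
```

If the printed command looks right, running `wget "$@"` in the same shell performs the download with exactly those arguments.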
If --level 2 is not what you want, you could make your point clear by making up a small document tree as an example.

Regards, Tim

On Tuesday, 2 August 2016 12:38:25 CEST, Dale R. Worley wrote:
> I want to make a local copy of the "IANA protocol assignments" web
> pages. It seems to me that this ought to be a simple use of wget in
> recursive mode, and indeed, it seems like someone else must have run
> into this need before. But I can't get a combination of wget options
> that has the behavior I want.
>
> The goal is to make a local file tree that mirrors these URLs:
>
>     http://www.iana.org/assignments/index.html
>     (That page should be in a file named 'index.html'.)
>
>     every HTML page under http://www.iana.org/assignments/ that can be
>     reached from index.html
>
>     page requisites for those pages, even if they aren't under
>     http://www.iana.org/assignments/
>
> The interference comes from all the stuff under http://www.iana.org that
> is not under http://www.iana.org/assignments, but which is pointed to by
> the pages listed above.
>
> To resolve the simple problem, it appears that --page-requisites does
> fetch the page requisites, even if they aren't under
> http://www.iana.org/assignments/. So that part of the solution works
> fine.
>
> But I can't figure out the right combination of options to fetch the
> HTML files that I want:
>
>
> wget --mirror --convert-links --no-parent --page-requisites http://www.iana.org/assignments/index.html
>     Follows links outside of /assignments/.
>
> wget --mirror --convert-links --exclude-directories=/ --page-requisites http://www.iana.org/assignments/index.html
>     This doesn't recurse beyond index.html.
>
> wget --mirror --convert-links --no-parent --page-requisites http://www.iana.org/assignments
>     Follows links outside of /assignments/.
>
> wget --mirror --convert-links --exclude-directories=/ --page-requisites http://www.iana.org/assignments
>     This doesn't recurse beyond index.html.
>
> wget --mirror --convert-links --no-parent --page-requisites http://www.iana.org/assignments/
>     This doesn't recurse beyond index.html.
>
> wget --mirror --convert-links --exclude-directories=/ --page-requisites http://www.iana.org/assignments/
>     This doesn't recurse beyond index.html.
>
>
> I'm hoping that this is a known problem and someone can tell me the
> answer without having to think about it.
>
> I also think the documentation could be made clearer in some places, but
> that can wait.
>
> Dale
