Hi Dale,

I'm seeing that it always redirects to www.iana.org/protocols.
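One way to check where a URL redirects is to ask wget for the server's response headers without following the redirect. The following is only a minimal sketch, assuming a POSIX shell. The helper names extract_location and show_redirect are made up for illustration, and the exact layout of wget's header output may vary between versions.

```shell
# Hypothetical helper: print the value of any Location header found on stdin.
extract_location() {
    sed -n 's/^[[:space:]]*[Ll]ocation:[[:space:]]*//p'
}

# Hypothetical helper: request the URL in $1 without downloading it
# (--spider), print the response headers (--server-response), and refuse
# to follow redirects (--max-redirect=0). wget writes the headers to
# stderr, hence the 2>&1.
show_redirect() {
    wget --spider --server-response --max-redirect=0 "$1" 2>&1 | extract_location
}
```

It could be called as `show_redirect http://www.iana.org/assignments/` (assuming that is the URL in question). wget may report the redirect target more than once, so the same location can appear on several lines.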
Would -A protocols work for you? E.g.:

wget --mirror --convert-links --no-parent --page-requisites -A protocols http://www.iana.org/protocols

On 02/08/16 18:38, Dale R. Worley wrote:
> I want to make a local copy of the "IANA protocol assignments" web
> pages. It seems to me that this ought to be a simple use of wget in
> recursive mode, and indeed, it seems like someone else must have run
> into this need before. But I can't get a combination of wget options
> that has the behavior I want.
>
> The goal is to make a local file tree that mirrors these URLs:
>
> http://www.iana.org/assignments/index.html
> (That page should be in a file named 'index.html'.)
>
> every HTML page under http://www.iana.org/assignments/ that can be
> reached from index.html
>
> page requisites for those pages, even if they aren't under
> http://www.iana.org/assignments/
>
> The interference comes from all the stuff under http://www.iana.org that
> is not under http://www.iana.org/assignments, but which is pointed to by
> the pages listed above.
>
> To resolve the simple problem, it appears that --page-requisites does
> fetch the page requisites, even if they aren't under
> http://www.iana.org/assignments/. So that part of the solution works
> fine.
>
> But I can't figure out the right combination of options to fetch the
> HTML files that I want:
>
>
> wget --mirror --convert-links --no-parent --page-requisites
> http://www.iana.org/assignments/index.html
> Follows links outside of /assignments/.
>
> wget --mirror --convert-links --exclude-directories=/ --page-requisites
> http://www.iana.org/assignments/index.html
> This doesn't recurse beyond index.html.
>
> wget --mirror --convert-links --no-parent --page-requisites
> http://www.iana.org/assignments
> Follows links outside of /assignments/.
>
> wget --mirror --convert-links --exclude-directories=/ --page-requisites
> http://www.iana.org/assignments
> This doesn't recurse beyond index.html.
>
> wget --mirror --convert-links --no-parent --page-requisites
> http://www.iana.org/assignments/
> This doesn't recurse beyond index.html.
>
> wget --mirror --convert-links --exclude-directories=/ --page-requisites
> http://www.iana.org/assignments/
> This doesn't recurse beyond index.html.
>
>
> I'm hoping that this is a known problem and someone can tell me the
> answer without having to think about it.
>
> I also think the documentation could be made clearer in some places, but
> that can wait.
>
> Dale
>
