I've tried some further experiments.  One thing I realized was that the
"protocols" directory contains only two files, one of which I wanted, so
I could get very close to the ideal with

$ wget -r --include-directories=/assignments,/protocols \
      http://www.iana.org/protocols/index.html

Unfortunately, the web site is constructed in a way that confounds that,
because protocols/index.html *also* redirects, to
http://www.iana.org/protocols!  (Even though that is also the name of a
directory.)  There's no way to retrieve the root HTML file without
wget considering its "directory" to be "http://www.iana.org/".
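
As a rough illustration of why the redirect matters, here is a shell
sketch of a purely path-based notion of "directory".  The url_dir
helper is hypothetical and is not taken from wget's source; it only
assumes that the "directory" is everything in the URL path before the
last "/", and that the URL has a path after the host name.

```shell
# Hypothetical helper (not wget's code): print the directory part of a
# URL's path.  Assumes the URL has the form scheme://host/path.
url_dir() {
    # Strip "scheme://host/" and put the leading slash back.
    path="/${1#*://*/}"
    # dirname drops the last path component.
    dirname "$path"
}

url_dir http://www.iana.org/protocols/index.html   # prints /protocols
url_dir http://www.iana.org/protocols              # prints /
```

Under that assumption, the URL before the redirect falls inside
/protocols, while the URL after the redirect falls in the root.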

So there's no nice solution without either revising the web site or
changing wget's behavior.

Dale
