I've tried some further experiments. One thing I realized was that the "protocols" directory contains only two files, one of which I wanted, so I could get very close to the ideal with

    $ wget -r --include-directories=/assignments,/protocols http://www.iana.org/protocols/index.html

Unfortunately, the web site is constructed to confound that, because protocols/index.html *also* redirects, to http://www.iana.org/protocols! (Even though that is also the name of a directory.) There's no way to retrieve the root HTML file without wget considering its "directory" to be "http://www.iana.org/". So there's no nice solution without either revising the web site or changing wget's behavior.

Dale
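P.S. To make the "directory" point concrete, here is a rough sketch of the rule as I understand it, assuming the directory of a URL is simply everything in the path up to the last slash. The url_dir function is a hypothetical helper written for illustration, not wget's actual code, and it assumes the URL has at least a "/" after the host name.

```shell
# Illustration only: approximate the "directory" of a URL as the part of
# the path before the last slash. This is NOT wget's real implementation.
url_dir() {
    url_dir_path="/${1#*://*/}"        # drop scheme and host, keep the path
    url_dir_dir="${url_dir_path%/*}"   # drop the last path component
    printf '%s\n' "${url_dir_dir:-/}"  # an empty result means the root
}

url_dir http://www.iana.org/protocols/index.html   # prints /protocols
url_dir http://www.iana.org/protocols              # prints /
```

Under that assumption, the URL I ask for sits in /protocols, but the URL the server redirects to sits in /, which would explain why the root HTML file ends up filed under the site root.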
