It may be useful to add a paragraph to the manual letting users know
that they can use the --debug option to see why certain URLs are not
followed (rejected) by wget. It would be especially useful to mention
this in "9.1 Robot Exclusion". Something like this:
If you wish to see which URLs are blocked by robots.txt while wget
is crawling, use the --debug option. For each rejected URL you will
see two lines describing why it is being rejected:
Rejecting path /abc/bar.html because of rule `/abc'.
Not following http://foo.org/abc/bar.html because robots.txt forbids it.
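Since --debug produces a lot of output, it may also be worth showing readers how to pick out just these two messages from a saved log. A small sketch (the file name debug.log is hypothetical; note that wget writes debug output to stderr, so it would be captured with something like `wget --debug -r http://foo.org/ 2> debug.log`):

```shell
# Sample debug output, standing in for a real log captured from wget.
cat <<'EOF' > debug.log
Rejecting path /abc/bar.html because of rule `/abc'.
Not following http://foo.org/abc/bar.html because robots.txt forbids it.
EOF

# Show only the robots.txt rejection messages from the log.
grep -E 'Rejecting path|Not following' debug.log
```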
Thanks,
Frank