It just occurred to me that, since wget handles this correctly when it gets the rule from robots.txt, maybe the issue could be worked around by proxying or spoofing the remote site's robots.txt file locally. That is, I write
    User-agent: *
    Disallow: wgettest/links2.html

into a file, save it in my home directory, and then somehow tell wget that davidskalinder.com/robots.txt is actually located at /home/user/robots.txt. Does anybody know a convenient way of doing this? Or is there an easier workaround I'm overlooking?
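
To make the idea concrete, here is a rough sketch of the kind of interception I have in mind: a tiny local HTTP proxy that answers requests for davidskalinder.com/robots.txt out of ~/robots.txt and passes every other request through to the real server. The hostname, the port 8888, and the ~/robots.txt location are just placeholders for illustration, it only handles plain-HTTP GETs, and I haven't tested it against the real site, so treat it as a sketch rather than a working tool.

    #!/usr/bin/env python3
    # Sketch of a spoofing HTTP proxy for wget's robots.txt fetch.
    # Assumptions (placeholders, untested): the site is plain HTTP,
    # the fake rules live in ~/robots.txt, and wget is pointed at the
    # proxy via the http_proxy environment variable.
    import urllib.error
    import urllib.request
    from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer
    from pathlib import Path

    TARGET_HOST = "davidskalinder.com"          # host whose robots.txt gets replaced
    LOCAL_ROBOTS = Path.home() / "robots.txt"   # local file served in its place
    LISTEN = ("127.0.0.1", 8888)

    class RobotsSpoofProxy(BaseHTTPRequestHandler):
        def do_GET(self):
            # When a client talks to a proxy, self.path is the absolute URL,
            # e.g. "http://davidskalinder.com/robots.txt".
            if self.path == f"http://{TARGET_HOST}/robots.txt":
                self._reply(200, "text/plain", LOCAL_ROBOTS.read_bytes())
                return
            # Anything else is fetched from the real server and relayed verbatim.
            try:
                with urllib.request.urlopen(self.path) as upstream:
                    ctype = upstream.headers.get("Content-Type",
                                                 "application/octet-stream")
                    self._reply(upstream.status, ctype, upstream.read())
            except urllib.error.HTTPError as err:
                self.send_error(err.code, err.reason)

        def _reply(self, status, ctype, body):
            self.send_response(status)
            self.send_header("Content-Type", ctype)
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

    if __name__ == "__main__":
        ThreadingHTTPServer(LISTEN, RobotsSpoofProxy).serve_forever()

Then, assuming wget still requests /robots.txt when it goes through a proxy (which is my understanding but I haven't verified it), something like

    http_proxy=http://127.0.0.1:8888/ wget -r http://davidskalinder.com/wgettest/

ought to make wget pick up the spoofed rules, since it would fetch robots.txt through the proxy like any other URL. But this feels heavier than it should be, hence my question about a simpler way.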
