if it does not obey, server admins will ban it

the workaround:
1) get the single HTML file first, edit out the meta tag, then re-get with --no-clobber (the tag is usually only in landing pages); sketched below
2) an empty robots.txt (or an allow-all one; search the net for an example)
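a rough sketch of workaround 1), assuming GNU wget and sed, with example.com standing in for the real site:

  # 1) grab only the landing page, using the same directory layout
  #    the recursive run will use (example.com/index.html)
  wget --force-directories http://example.com/index.html

  # edit out the robots meta tag by hand, or with something like:
  sed -i 's/<meta[^>]*name="robots"[^>]*>//I' example.com/index.html

  # 2) re-get recursively; --no-clobber makes wget parse the edited
  #    local copy instead of re-downloading and overwriting it
  wget -r --page-requisites --no-clobber http://example.com/index.html

(robots.txt itself is still honoured by the second run; that is what workaround 2) is about)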
possible solutions:
A) a command line option
B) ./configure --disable-robots-check

Paul

On Mon, Dec 5, 2011 at 10:33 AM, Giuseppe Scrivano <[email protected]> wrote:
> [email protected] writes:
>
>> But in cases where you *are* recursively downloading and using
>> --page-requisites, it would be polite to otherwise obey the robots
>> exclusion standard by default. Which you can't do if you have to use -e
>> robots=off to ensure all requisites are downloaded.
>
> it seems a good idea: to handle -r and --page-requisites in this case,
> wget shouldn't obey the robots exclusion directives.
>
> Thanks,
> Giuseppe
>
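(for reference, the invocation the quoted discussion is about: with the current default you have to switch robots handling off explicitly to get every requisite, e.g.

  wget -r --page-requisites -e robots=off http://example.com/

with example.com again just a placeholder)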
