Re: [Bug-wget] --page-requisites and robot exclusion issue

2011-12-05 Thread Paul Wratt
If it does not obey, server admins will ban it. The workaround: 1) get the single HTML file first, edit out the meta tag, then re-get with --no-clobber (usually only needed for landing pages); 2) an empty robots.txt (or one that allows all; search the net). Possible solutions: A) a command-line option, B) a ./configure option
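[A minimal sketch of workaround 1 as described above, using a hypothetical example.com landing page; the sed expression is only one guess at how "edit out meta tag" might be done, and the second workaround (an empty or allow-all robots.txt) is not covered here:]

  # 1) fetch just the landing page
  wget http://example.com/index.html

  # 2) strip the robots meta tag from the saved copy (assumed tag form)
  sed -i 's/<meta[^>]*name="robots"[^>]*>//I' index.html

  # 3) re-get with --page-requisites and --no-clobber; with -nc wget keeps
  #    the edited local .html copy, parses it, and fetches the requisites
  wget --page-requisites --no-clobber http://example.com/index.html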

Re: [Bug-wget] --page-requisites and robot exclusion issue

2011-12-05 Thread markk
Hi, Paul Wratt wrote: If it does not obey, server admins will ban it. The workaround: 1) get the single HTML file first, edit out the meta tag, then re-get with --no-clobber (usually only needed for landing pages); 2) an empty robots.txt (or one that allows all; search the net). Possible solutions: A) a command-line option

Re: [Bug-wget] --page-requisites and robot exclusion issue

2011-12-05 Thread Giuseppe Scrivano
Paul Wratt paul.wr...@gmail.com writes: If it does not obey, server admins will ban it. The workaround: 1) get the single HTML file first, edit out the meta tag, then re-get with --no-clobber (usually only needed for landing pages); 2) an empty robots.txt (or one that allows all; search the net). Possible solutions: A)

[Bug-wget] --page-requisites and robot exclusion issue

2011-12-04 Thread markk
Hi, I'm using wget 1.13.4. There seems to be a problem with wget over-zealously obeying robot exclusion when --page-requisites is used, even when only downloading a single URL. I attempted to download a single web page, specifying --page-requisites so that images, CSS and JavaScript files would also be downloaded.
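[For reference, a sketch of the kind of invocation the report describes, with a placeholder URL:]

  # fetch one page plus the images, CSS and JavaScript it references;
  # per the report above, requisites disallowed by robot exclusion
  # may be skipped in 1.13.4 even for this single-URL case
  wget --page-requisites http://example.com/some/page.html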