[Bug-wget] robots.txt not working

2012-03-17 Thread phil curb
I tried creating a web server locally, putting robots.txt in there and using wget, and it didn't work: http://pastebin.com/raw.php?i=kt1mV2af C:\rwget 127.0.0.1:56 2012-03-16 19:45:32 (20.0 KB/s) - `index.html' saved [3/3] C:\rwget

Re: [Bug-wget] robots.txt not working

2012-03-17 Thread Micah Cowan
I think you're misunderstanding what was supposed to happen. The robots.txt file is only followed for links that wget is automatically following. This means (a) wget has to be in recursive-descent mode (-r or -m), and (b) it only applies to links that weren't explicitly requested by the user. In
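The rule described in this reply can be modelled in a few lines. The sketch below is only an illustration of that description, not wget's actual code: the function name `allowed_by_wget_rule`, its parameters, and the sample robots.txt rules are all hypothetical, and Python's standard `urllib.robotparser` stands in for wget's own parser.

```python
from urllib.robotparser import RobotFileParser


def allowed_by_wget_rule(url, robots_lines, recursive, explicitly_requested,
                         user_agent="Wget"):
    """Hypothetical model of the behaviour described above (not wget's code)."""
    # (a) robots.txt is only consulted in recursive mode (-r or -m).
    if not recursive:
        return True
    # (b) URLs the user asked for explicitly are fetched regardless.
    if explicitly_requested:
        return True
    # Only automatically followed links are checked against robots.txt.
    parser = RobotFileParser()
    parser.parse(robots_lines)
    return parser.can_fetch(user_agent, url)


# Assumed sample rules; the original poster's robots.txt is not shown here.
robots = ["User-agent: *", "Disallow: /private/"]
blocked = "http://127.0.0.1:56/private/page.html"

# Plain "wget URL": not recursive, so robots.txt plays no part.
print(allowed_by_wget_rule(blocked, robots, recursive=False,
                           explicitly_requested=True))
# Recursive run, link discovered while crawling: robots.txt applies.
print(allowed_by_wget_rule(blocked, robots, recursive=True,
                           explicitly_requested=False))
```

Under this model, the non-recursive request for 127.0.0.1:56 in the original report would be fetched whatever robots.txt says, which matches the `index.html' saved` line in the poster's output.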

Re: [Bug-wget] wget reports: libproxy suggest to use 'direct://'

2012-03-17 Thread Wilfred van Velzen
Hi, On Sat, Mar 17, 2012 at 12:04 PM, Giuseppe Scrivano gscriv...@gnu.org wrote: Wilfred van Velzen wvvel...@gmail.com writes: Link: gcc -fmessage-length=0 -O2 -Wall -D_FORTIFY_SOURCE=2 -fstack-protector -funwind-tables -fasynchronous-unwind-tables -g -lproxy

Re: [Bug-wget] wget reports: libproxy suggest to use 'direct://'

2012-03-17 Thread Giuseppe Scrivano
Wilfred van Velzen wvvel...@gmail.com writes: These links might be useful: http://software.opensuse.org/search/download?base=openSUSE%3A12.1file=openSUSE%3A%2F12.1%2Fstandard%2Fsrc%2Fwget-1.13.4-6.1.3.src.rpmquery=wget

Re: [Bug-wget] Some questions about wget idea

2012-03-17 Thread Micah Cowan
On 03/17/2012 09:45 AM, Boris Bobrov wrote: Hello! I've noticed the task of adding concurrency to wget and was really happy to see that wget will soon get that feature - I needed it a lot some time ago. I would also like to implement that feature. But I've got some questions beforehand.

[Bug-wget] GSoC2012 - parallel downloading

2012-03-17 Thread Adam 'foo-script' Rakowski
Hi guys, I am Adam, a CS student from UE. Your idea for this GSoC is very exciting. I have experience with parallel programming with OpenMP, but I need to dive deeper into the wget code to find out whether OMP is applicable. A problem with OMP is that it cannot guarantee that IO operations will be held in proper

[Bug-wget] Dead links in documentations

2012-03-17 Thread Adam 'foo-script' Rakowski
An official Wget page [Downloading GNU Wget] contains a link to: http://wget.addictivecode.org/Faq#download Links to: http://ftp.gnu.org/gnu/wget/wget-latest.tar.gz (GNU.org) http://ftp.gnu.org/gnu/wget/wget-latest.tar.bz2 (GNU.org) give a 404 error. Best, Adam

Re: [Bug-wget] can't get wget to not download

2012-03-17 Thread Henrik Holst
Wget only obeys robots.txt when doing a full recursive download of a complete site: Wget can follow links in HTML, XHTML, and CSS pages, to create local versions of remote web sites, fully recreating the directory structure of the original site. This is sometimes referred