Greetings,

I stumbled across a bug yesterday that reproduces in both v1.8.2 and v1.10.2.

Apparently, a recursive get tries to re-open each file for reading after downloading it, in order to find and fetch subsequent files. The problem is that when wget is invoked with -O - to deliver everything to stdout, there is no file on disk to open, so you get the output below (note the "No such file or directory" error). In 1.10 the error message appears to have been removed, but wget still fails to fetch recursively.

I realize there wouldn't seem to be much reason to send more than one page to stdout, but I'm feeding it all into a statistical filter to classify website data, and the filter doesn't care where one page ends and the next begins. Do you know of any workaround, other than letting wget save the files and reading them back afterward (roughly the sketch below), which won't scale at thousands of pages per minute?
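
For concreteness, the save-then-replay approach I'd rather avoid looks something like this (just a rough sketch; /tmp/crawl and the classify command are placeholders for my actual setup):

$ wget -q -r -P /tmp/crawl http://www.zdziarski.com
$ find /tmp/crawl -type f -exec cat {} + | classify
$ rm -rf /tmp/crawl

The extra round trip through the filesystem is what kills it at volume.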

Thanks!

$ wget -O - -r http://www.zdziarski.com > out
--15:40:06--  http://www.zdziarski.com/
           => `-'
Resolving www.zdziarski.com... done.
Connecting to www.zdziarski.com[209.51.159.242]:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 24,275 [text/html]

100%[====================================>] 24,275 163.49K/s ETA 00:00

15:40:06 (163.49 KB/s) - `-' saved [24275/24275]

www.zdziarski.com/index.html: No such file or directory

FINISHED --15:40:06--
Downloaded: 24,275 bytes in 1 files

Jonathan