Hello,

I'm using the official Nutch 1.3 distribution to crawl our internal
MediaWiki instance. Whenever a 404 is encountered, I get the following error:

> fetch of http://wiki.example.org/INTERN_WIKI:Impressum failed
> with: java.net.SocketTimeoutException: Read timed out

The page really does not exist:
> $ curl -I http://wiki.example.org/INTERN_WIKI:Impressum
> HTTP/1.1 404 Not Found

So I think the error message is misleading: the server answers with a
404, yet Nutch reports a read timeout. Is that a bug?

-- 
Best regards
Christian Weiske
