nutch-user  

Re: Question about Nutch crawling

kevin chen
Thu, 03 Jul 2008 19:16:35 -0700

There can be any number of reasons:
- Crawling is disallowed by the site's robots.txt; this is probably the most common cause.
- The content is session controlled.
- The site requires authentication.
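
To rule out the robots.txt case quickly, something like the following may help. It is only a rough sketch using Python's standard urllib.robotparser, not Nutch's own robots handling. The helper name is_allowed, the sample rules, and the agent string "Nutch" are placeholders for illustration; substitute the site's real robots.txt and whatever agent name your crawler is configured with.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url.

    Hypothetical helper for illustration; it is not part of Nutch.
    """
    parser = RobotFileParser()
    # parse() takes the robots.txt content as a list of lines,
    # so no network access is needed here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Made-up rules that block every crawler from the whole site.
SAMPLE_ROBOTS = """\
User-agent: *
Disallow: /
"""

if __name__ == "__main__":
    # "Nutch" is a placeholder agent name (assumption).
    print(is_allowed(SAMPLE_ROBOTS, "Nutch", "http://example.com/index.html"))
```

If this reports that the URL is not allowed for your agent, the crawler is most likely just obeying the site's rules. If it is allowed, the session or authentication cases are the next things to look at.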

On Wed, 2008-07-02 at 10:32 -0400, Bozhao Tan wrote:
> Hello, I do not know why Nutch cannot crawl anything from some internet
> sites.
> Has anyone else run into this problem?
> Thanks!
> 
> NewGuyInNutch