Can you check the log file for more info?

Default location: $NUTCH_HOME/logs/hadoop.log
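
If the log is long, a small script can pull out just the warnings and errors. This is only a rough sketch: it assumes the log uses the usual log4j layout where the level (WARN, ERROR, FATAL) appears as a separate word on each line, and the helper name `filter_problem_lines` is made up for illustration, not part of Nutch.

```python
import os
from pathlib import Path

# Log levels worth looking at first (assumption: log4j-style level names).
PROBLEM_LEVELS = ("WARN", "ERROR", "FATAL")


def filter_problem_lines(lines, levels=PROBLEM_LEVELS):
    """Return the lines that contain one of the given levels as a whole word."""
    hits = []
    for line in lines:
        words = line.split()
        if any(level in words for level in levels):
            hits.append(line.rstrip("\n"))
    return hits


if __name__ == "__main__":
    # Falls back to the current directory if NUTCH_HOME is not set.
    nutch_home = os.environ.get("NUTCH_HOME", ".")
    log_file = Path(nutch_home) / "logs" / "hadoop.log"
    if log_file.exists():
        with open(log_file, encoding="utf-8", errors="replace") as fh:
            # Show only the most recent 20 matching lines.
            for hit in filter_problem_lines(fh)[-20:]:
                print(hit)
    else:
        print(f"No log file found at {log_file}")
```

Run it right after a crawl, so the last matching lines belong to the run that returned nothing.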

Ref:
http://www.opensourceconnections.com/blog/2014/05/24/crawling-with-nutch/


On Fri, Jul 18, 2014 at 8:52 PM, Ankur Dulwani <[email protected]>
wrote:

> Hi,
> I am using Nutch to crawl data from different sources. It works for most
> websites, but it returns an empty result for some sites, such as
> https://www.google.com/finance.
>
> Fetcher: throughput threshold sequence: 5
> 0/0 spinwaiting/active, 0 pages, 0 errors, 0.0 0 pages/s, 0 0 kb/s, 0 URLs
> in 0 queues
>
>
> This is what I get after crawling.
>
> Do I need to add any configuration or properties?
>
> Thanks in advance.
