RE: How to work depth and topN while crawling

Markus Jelsma Tue, 01 Apr 2014 03:01:56 -0700

Hi - Nutch crawls in cycles. In each cycle it processes a number of URLs 
and adds the newly found links to the DB. In your case you are doing 10 crawl 
cycles with a maximum of 500 URLs per cycle.
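
To make the cycle behaviour concrete, here is a toy simulation in Python. It is only a sketch, not Nutch code: `crawl`, `max_fetched`, `fetch_links` and the in-memory `db` are hypothetical names, and the batch is taken in insertion order, whereas Nutch ranks candidate URLs by score when it generates a fetch list.

```python
def crawl(seeds, fetch_links, depth, top_n):
    """Toy model of crawl cycles (hypothetical, not Nutch code).

    Returns the number of URLs fetched after at most `depth` cycles.
    """
    db = {url: False for url in seeds}  # url -> already fetched?
    for _ in range(depth):
        # Generate: pick at most top_n unfetched URLs.
        # Assumption: insertion order stands in for Nutch's scoring.
        batch = [u for u, fetched in db.items() if not fetched][:top_n]
        if not batch:
            break  # nothing left to fetch
        for url in batch:
            db[url] = True
            # Update: newly found links enter the DB as unfetched.
            for link in fetch_links(url):
                db.setdefault(link, False)
    return sum(db.values())


def max_fetched(depth, top_n):
    # Upper bound: every cycle fetches a full batch of top_n URLs.
    return depth * top_n
```

With depth=10 and topN=500, `max_fetched(10, 500)` gives 5000, so the number of fetched URLs grows at most linearly with depth even if the number of discovered links grows much faster. A total below the bound, such as the 4000 reported, is expected when some cycles have fewer than topN unfetched URLs available, for example the first cycle, which only has the seed URLs.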


 
 
-----Original message-----
> From:reddibabu <[email protected]>
> Sent: Tuesday 1st April 2014 11:43
> To: [email protected]
> Subject: How to work depth and topN while crawling
> 
> Hi All,
> 
> I have set threads=100, depth=10 and topN=500, and I am able to crawl and
> index 4000 URLs from Nutch into Solr. But I don't understand how the depth
> and topN parameters are processed internally. Is the growth linear or
> exponential? Could anyone please explain this clearly? That would help me a lot.
> 
> 
> Thanks in advance.
> Reddi Babu
> 
> 
> 
> --
> View this message in context: 
> http://lucene.472066.n3.nabble.com/How-to-work-depth-and-topN-while-crawling-tp4128377.html
> Sent from the Nutch - User mailing list archive at Nabble.com.
>
