Thanks -Chethan
On Wed, Jun 6, 2012 at 1:34 PM, Markus Jelsma <[email protected]> wrote:

> -----Original message-----
> > From: chethan <[email protected]>
> > Sent: Wed 06-Jun-2012 05:12
> > To: [email protected]
> > Subject: Nutch topN selection
> >
> > Hi,
>
> hi
>
> > Does the topN threshold consider page score for the selection. If it's
> > set to say 10, does Nutch queue up the 10 top scoring URLs on a page?
>
> Yes. TopN will consider the top URL's that are eligible for fetch.
>
> > Does this work through the webgraph or is it just the first 10 that it
> > comes across on a page?
>
> It uses a CrawlDatum's score field which can be updated by the WebGraph.
>
> > Thanks,
> > Chethan
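
For anyone reading this thread later, the behaviour Markus describes (take the eligible URLs, rank them by score, keep the top N) can be sketched roughly as below. This is only an illustration in Python, not Nutch's actual Generator code: the `Entry` class, its `fetch_time` field, and `select_top_n` are hypothetical names, and "eligible" is assumed here to mean simply "due for fetch".

```python
import heapq
from dataclasses import dataclass


@dataclass
class Entry:
    """Hypothetical stand-in for a crawl record; not Nutch's CrawlDatum."""
    url: str
    score: float      # analogous to the score field mentioned above
    fetch_time: int   # assumed: time at which the URL becomes due for fetch


def select_top_n(entries, top_n, now):
    """Return up to top_n eligible entries, highest score first."""
    # Assumption: an entry is eligible once its fetch time has passed.
    eligible = [e for e in entries if e.fetch_time <= now]
    # Rank by score, not by the order in which entries were encountered.
    return heapq.nlargest(top_n, eligible, key=lambda e: e.score)


if __name__ == "__main__":
    candidates = [
        Entry("http://example.com/a", 0.2, 50),
        Entry("http://example.com/b", 0.9, 50),
        Entry("http://example.com/c", 1.5, 500),  # not yet due
        Entry("http://example.com/d", 0.7, 10),
    ]
    for entry in select_top_n(candidates, top_n=2, now=100):
        print(entry.url, entry.score)
```

The point of the sketch is that selection depends on score rather than on discovery order, so a score updated elsewhere (for example by the WebGraph, as noted above) changes which URLs make the cut.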

