-----Original message-----
> From:chethan <[email protected]>
> Sent: Wed 06-Jun-2012 05:12
> To: [email protected]
> Subject: Nutch topN selection
>
> Hi,

Hi,

>
> Does the topN threshold consider page score for the selection. If it's set
> to say 10, does Nutch queue up the 10 top scoring URLs on a page?

Yes. TopN will consider the top-scoring URLs that are eligible for fetch.

> Does this work through the webgraph or is it just the first 10 that it comes
> across on a page?

It uses a CrawlDatum's score field, which can be updated by the WebGraph.

>
> Thanks,
> Chethan
>
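
To illustrate the selection described above, here is a rough Python sketch of "take the N highest-scoring entries among those eligible for fetch". It is not Nutch's actual Generator code: the `Entry` class, its `eligible` flag, and `select_top_n` are hypothetical names made up for this example, and the scores are invented.

```python
import heapq
from dataclasses import dataclass


@dataclass
class Entry:
    # Hypothetical stand-in for a crawl record; not Nutch's CrawlDatum class.
    url: str
    score: float    # assumed to play the role of the CrawlDatum score field
    eligible: bool  # assumed flag: True if the URL is due for fetching


def select_top_n(entries, top_n):
    """Return up to top_n eligible entries, highest score first."""
    candidates = [e for e in entries if e.eligible]
    return heapq.nlargest(top_n, candidates, key=lambda e: e.score)


entries = [
    Entry("http://example.com/a", 0.2, True),
    Entry("http://example.com/b", 1.5, True),
    Entry("http://example.com/c", 3.0, False),  # high score, but not due for fetch
    Entry("http://example.com/d", 0.9, True),
]

selected = select_top_n(entries, 2)
print([e.url for e in selected])
```

Note that the entry with the highest score is skipped here because it is not eligible, which is the point of "top URLs that are eligible for fetch": the score only ranks the candidates that are already due.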

