Thanks, Ralf! Yes, I figured it out.
On Tue, Oct 29, 2013 at 2:15 AM, Ralf R. Kotowski <[email protected]> wrote:
> It is specified INSIDE the crawl script itself!
>
> -----Original Message-----
> From: A Laxmi [mailto:[email protected]]
> Sent: Tuesday, October 01, 2013 5:58 PM
> To: [email protected]
> Subject: Nutch 2.2.1 with HBase crawl command - topN
>
> Hi,
>
> I have Nutch 2.2.1 with HBase as backend. I am using the all-in-one crawl
> command, which runs fine -
>
> *bin/crawl urls 3 http://localhost:8983/solr/ 10*
>
> *crawl <seedDir> <crawlId> <solrURL> <numberOfRounds>*
>
> My question is - Where do we specify the "*topN*" parameter for the above
> all-in-one crawl command?
>
> topN - maximum number of pages that will be retrieved at each level

