Hi,

You need to have the Nutch job file (with all configuration in there) on your Hadoop classpath. You then run

hadoop jar $nutch.jar $nutch.class -params

This should be all that is required.
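As a rough illustration of that invocation, here is a small sketch that only prints the command rather than running it. The job file name (apache-nutch-1.5.job) is an assumption about what your build produced, and nutch_cmd is a hypothetical helper, not something shipped with Nutch. org.apache.nutch.crawl.CrawlDbReader is the class behind the crawldb stats, and the crawldb path is the one from your output.

```shell
#!/bin/sh
# Assumed (hypothetical) defaults; adjust for your installation.
NUTCH_JOB="${NUTCH_JOB:-apache-nutch-1.5.job}"   # job file with the config bundled in
CRAWLDB="${CRAWLDB:-finalCrawl/crawldb}"         # crawldb path from the quoted message

# Hypothetical helper: print the hadoop invocation for a Nutch class and its arguments.
nutch_cmd() {
    class="$1"; shift
    echo "hadoop jar $NUTCH_JOB $class $*"
}

# Show the command that would report the crawldb statistics.
nutch_cmd org.apache.nutch.crawl.CrawlDbReader "$CRAWLDB" -stats
```

Once the printed line looks right for your setup, paste it into a shell on a machine where hadoop is on the PATH.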
Lewis

On Sat, Jul 7, 2012 at 12:06 AM, shekhar sharma <[email protected]> wrote:
> Hello All,
> I am trying to run nutch (1.5.0 binaries) on hadoop, but when i am seeing
> the db stats it shows me the following output:
>
> CrawlDb statistics start: finalCrawl/crawldb
> Statistics for CrawlDb: finalCrawl/crawldb
> TOTAL urls: 6
> retry 1: 6
> min score: 1.0
> avg score: 1.0
> max score: 1.0
> status 1 (db_unfetched): 6
> CrawlDb statistics: done
>
> What i did is copied entire files from Nutch conf directory to Hadoop conf
> directory
>
> I have rename nutch-default.xml to nutch-site.xml. And also provided
> plugins.folder properties, plugins.includes, http.agent.name etc..
>
> Crawling is successful, but when i am trying to dump the crawldb contents
> it shows me nothing and when i do stats on crawldb it tells me that
> nothing is fetched..
>
> But when i do crawl using nutch only, everything is fine..
>
> Any suggestion what is going wrong?
>
> Regards,
> Som

--
Lewis

