Hello all,
I am trying to run Nutch (1.5.0 binaries) on Hadoop, but when I check
the CrawlDb stats I get the following output:

CrawlDb statistics start: finalCrawl/crawldb
Statistics for CrawlDb: finalCrawl/crawldb
TOTAL urls:    6
retry 1:    6
min score:    1.0
avg score:    1.0
max score:    1.0
status 1 (db_unfetched):    6
CrawlDb statistics: done

What I did was copy all the files from the Nutch conf directory to the
Hadoop conf directory.

I renamed nutch-default.xml to nutch-site.xml, and also set the
plugin.folders, plugin.includes, http.agent.name, etc. properties in it.
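
A minimal sketch of what such overrides in nutch-site.xml might look like, assuming the standard Nutch property names plugin.folders, plugin.includes and http.agent.name. The values are illustrative placeholders, not the actual settings used in this setup:

```xml
<?xml version="1.0"?>
<configuration>
  <!-- Assumption: placeholder path; use the location of the plugins directory in your setup -->
  <property>
    <name>plugin.folders</name>
    <value>plugins</value>
  </property>

  <!-- Assumption: example plugin list; adjust to the plugins you actually need -->
  <property>
    <name>plugin.includes</name>
    <value>protocol-http|urlfilter-regex|parse-(html|tika)|index-(basic|anchor)|scoring-opic|urlnormalizer-(pass|regex|basic)</value>
  </property>

  <!-- Assumption: hypothetical agent name; fetching requires a non-empty value -->
  <property>
    <name>http.agent.name</name>
    <value>MyNutchCrawler</value>
  </property>
</configuration>
```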

The crawl reports success, but when I try to dump the CrawlDb contents
I get nothing, and the stats on the CrawlDb show that nothing has been
fetched (all 6 URLs are still db_unfetched).

But when I crawl using Nutch alone (without Hadoop), everything works fine.

Any suggestions as to what is going wrong?

Regards,
Som
