Doug,

This is an extract from the same crawldb; I am running the application from outside of Nutch. Does Hadoop check for anything else, or is there a better way to iterate over the values in the crawldb without going through Hadoop? (This might actually be a question for the nutch community.)
Here is an extract from my crawldb:

http://dmoz.org/	Version: 4
Status: 2 (DB_fetched)
Fetch time: Thu Feb 22 12:44:05 GMT 2007
Modified time: Thu Jan 01 01:00:00 GMT 1970
Retries since fetch: 0
Retry interval: 30.0 days
Score: 1.0323955
Signature: f4c14c46074b66aad8829b8aa84cd636
Metadata: null

http://dmoz.org/Arts/	Version: 4
Status: 2 (DB_fetched)
Fetch time: Thu Feb 22 12:45:43 GMT 2007
Modified time: Thu Jan 01 01:00:00 GMT 1970
Retries since fetch: 0
Retry interval: 30.0 days
Score: 0.013471641
Signature: fe52a0bcb1071070689d0f661c168648
Metadata: null

Thanks

Armel
-------------------------------------------------
Armel T. Nene
iDNA Solutions
Tel: +44 (207) 257 6124
Mobile: +44 (788) 695 0483
http://blog.idna-solutions.com

-----Original Message-----
From: Doug Cutting [mailto:[EMAIL PROTECTED]
Sent: 23 January 2007 19:08
To: [email protected]
Subject: Re: Exception java.lang.ArithmeticException

Armel T. Nene wrote:
> My problem is when I run my program I get the following errors:
> Exception in thread "main" java.lang.ArithmeticException: / by zero
>   at org.apache.hadoop.mapred.lib.HashPartitioner.getPartition(HashPartitioner.java:33)
>   at org.apache.hadoop.mapred.MapFileOutputFormat.getEntry(MapFileOutputFormat.java:88)
>   at org.apache.nutch.crawl.CrawlDbReader.get(CrawlDbReader.java:321)
>   at

It looks like the crawl db path contains no files. Probably MapFileOutputFormat#getReaders() should throw an exception when no paths are listed (if 'names', on line 68, has length zero). That might make this a little easier to debug. So please check the path you're passing for the crawl db.

Doug
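For anyone hitting the same trace: the division by zero comes from the hash partitioner computing key.hashCode() modulo the number of part files found under the crawldb path, so an empty or wrong path (zero readers) means a modulus of zero. A minimal self-contained sketch of that arithmetic (the class and method names below are illustrative, not the actual Hadoop source):

```java
// PartitionSketch.java -- illustrative only; mimics HashPartitioner's
// hash-modulo logic to show why zero partitions raises ArithmeticException.
public class PartitionSketch {

    // Masking with Integer.MAX_VALUE keeps the hash non-negative before
    // taking it modulo the number of partitions (i.e. MapFile readers).
    static int getPartition(String key, int numPartitions) {
        return (key.hashCode() & Integer.MAX_VALUE) % numPartitions;
    }

    public static void main(String[] args) {
        // With at least one partition, every key maps to a valid index.
        System.out.println(getPartition("http://dmoz.org/", 4));

        // With zero partitions -- e.g. no part files under the crawldb
        // path -- the modulo is an integer division by zero.
        try {
            getPartition("http://dmoz.org/", 0);
        } catch (ArithmeticException e) {
            System.out.println("caught: " + e.getMessage()); // prints "caught: / by zero"
        }
    }
}
```

So the exception itself is just a symptom; the real check, as Doug says, is whether the path passed to CrawlDbReader actually contains the crawldb's part files.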
