I want to find out what the crawldb knows about some specific URLs. According to the Nutch wiki, I should use nutch readdb with the -url option. But when I run a command like the following, I get nasty "can't find class" exceptions:
    $NUTCH_HOME/runtime/deploy/bin/nutch readdb /crawls/popular/data/crawldb -url http://fabulous.com/

The error message is:

    Exception in thread "main" java.io.IOException: can't find class: org.apache.nutch.protocol.ProtocolStatus because org.apache.nutch.protocol.ProtocolStatus
        at org.apache.hadoop.io.AbstractMapWritable.readFields(AbstractMapWritable.java:212)
        at org.apache.hadoop.io.MapWritable.readFields(MapWritable.java:167)
        at org.apache.nutch.crawl.CrawlDatum.readFields(CrawlDatum.java:317)
        at org.apache.hadoop.io.SequenceFile$Reader.getCurrentValue(SequenceFile.java:2256)
        at org.apache.hadoop.io.MapFile$Reader.get(MapFile.java:680)
        at org.apache.hadoop.mapred.MapFileOutputFormat.getEntry(MapFileOutputFormat.java:99)
        at org.apache.nutch.crawl.CrawlDbReader.get(CrawlDbReader.java:465)
        at org.apache.nutch.crawl.CrawlDbReader.readUrl(CrawlDbReader.java:472)
        at org.apache.nutch.crawl.CrawlDbReader.run(CrawlDbReader.java:717)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
        at org.apache.nutch.crawl.CrawlDbReader.main(CrawlDbReader.java:736)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:136)

The above message occurs for any URL that is actually in the crawldb. If I specify a URL that does not exist, I get a more understandable message. Also, nutch readdb -stats works reliably. How can we make this work?

