Hi, I am getting a weird error in the DbUpdaterJob in Nutch 2.x. I am crawling these two links:
http://www.amazon.com/Degree-Antiperspirant-Deodorant-Extreme-Blast/dp/B001ET769Y
http://www.amazon.com/Cisco-WAP4410N-Wireless-N-Access-Point/dp/B001IYCMNA

All my jobs run fine, but when I run the dbupdate job I get this error:

    Exception in thread "main" java.lang.RuntimeException: job failed: name=update-table, jobid=job_local482736560_0001
        at org.apache.nutch.util.NutchJob.waitForCompletion(NutchJob.java:54)
        at org.apache.nutch.crawl.DbUpdaterJob.run(DbUpdaterJob.java:98)
        at org.apache.nutch.crawl.DbUpdaterJob.updateTable(DbUpdaterJob.java:105)
        at org.apache.nutch.crawl.DbUpdaterJob.run(DbUpdaterJob.java:119)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
        at org.apache.nutch.crawl.DbUpdaterJob.main(DbUpdaterJob.java:123)

And the Hadoop log file says:

    2013-06-17 21:51:41,478 WARN mapred.FileOutputCommitter - Output path is null in cleanup
    2013-06-17 21:51:41,479 WARN mapred.LocalJobRunner - job_local384125843_0001
    java.lang.IndexOutOfBoundsException
        at java.nio.Buffer.checkBounds(Buffer.java:559)
        at java.nio.HeapByteBuffer.get(HeapByteBuffer.java:143)
        at org.apache.avro.ipc.ByteBufferInputStream.read(ByteBufferInputStream.java:52)
        at org.apache.avro.io.DirectBinaryDecoder.doReadBytes(DirectBinaryDecoder.java:183)
        at org.apache.avro.io.BinaryDecoder.readString(BinaryDecoder.java:265)
        at org.apache.gora.mapreduce.FakeResolvingDecoder.readString(FakeResolvingDecoder.java:131)

If I crawl a simple page like www.google.nl, everything works fine, including the dbupdate job.

Any clues on how to debug this issue? What could be the reason for it?

Thanks,
Tony
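
P.S. This is only my guess at what the exception means, and it is not Nutch or Gora code: the IndexOutOfBoundsException from Buffer.checkBounds is what you get when a get() call asks a ByteBuffer to copy more bytes than the destination array can hold. A minimal standalone sketch in plain java.nio (class name and values made up for illustration):

    import java.nio.ByteBuffer;

    // Plain java.nio sketch, not Nutch/Gora code: reproduce the same
    // java.lang.IndexOutOfBoundsException from Buffer.checkBounds that
    // shows up in the Hadoop log above.
    public class BufferBoundsDemo {
        public static void main(String[] args) {
            ByteBuffer source = ByteBuffer.wrap(new byte[16]); // 16 bytes available
            byte[] dest = new byte[4];                         // but room for only 4
            // Asking for 10 bytes fails checkBounds(0, 10, 4) and throws
            // IndexOutOfBoundsException at java.nio.Buffer.checkBounds
            source.get(dest, 0, 10);
        }
    }

So it looks to me like the decoder, while in readString/doReadBytes, picks up a length from the serialized record that doesn't match the data actually stored for those Amazon pages, but I have no idea why that would happen only for these URLs.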