Hi,
I am getting the following exception when I run a crawl with Nutch, and I am
stuck because of it. I would really appreciate any pointers toward resolving
it. I found a related mail thread here
<http://www.mail-archive.com/[email protected]/msg07745.htm>, but
it doesn't describe a solution to the problem.
Exception in thread "main" java.io.IOException: Job failed!
    at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:604)
    at org.apache.nutch.indexer.DeleteDuplicates.dedup(DeleteDuplicates.java:439)
    at org.apache.nutch.crawl.Crawl.main(Crawl.java:135)
I looked at hadoop.log, and it contains the following stack trace:
mapred.TaskTracker - Error running child
java.lang.ArrayIndexOutOfBoundsException: -1
    at org.apache.lucene.index.MultiReader.isDeleted(MultiReader.java:113)
    at org.apache.nutch.indexer.DeleteDuplicates$InputFormat$DDRecordReader.next(DeleteDuplicates.java:176)
    at org.apache.hadoop.mapred.MapTask$1.next(MapTask.java:157)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:46)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:175)
    at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:1445)
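
From reading the trace, my guess is that DeleteDuplicates' record reader is
passing a document id of -1 into MultiReader.isDeleted(), which then indexes
off the front of its sub-reader array. The standalone snippet below is just a
sketch I put together to confirm that isDeleted(-1) on a MultiReader produces
this exact exception; the class name and the RAMDirectory/IndexWriter setup
are my own scaffolding, and I am only assuming the -1 in the real job comes
from an empty or inconsistent index segment:

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.MultiReader;
import org.apache.lucene.store.RAMDirectory;

public class IsDeletedRepro {
    public static void main(String[] args) throws Exception {
        // Scaffolding only: build a tiny one-document index in memory.
        RAMDirectory dir = new RAMDirectory();
        IndexWriter writer = new IndexWriter(dir, new StandardAnalyzer(), true);
        Document doc = new Document();
        doc.add(new Field("url", "http://example.com/",
                          Field.Store.YES, Field.Index.UN_TOKENIZED));
        writer.addDocument(doc);
        writer.close();

        // Wrap the index in a MultiReader, which is what the trace above
        // shows DeleteDuplicates going through.
        IndexReader reader =
            new MultiReader(new IndexReader[] { IndexReader.open(dir) });
        System.out.println(reader.isDeleted(0));   // valid doc id: prints false
        System.out.println(reader.isDeleted(-1));  // ArrayIndexOutOfBoundsException: -1
    }
}

So the question seems to be why DDRecordReader.next() ends up computing a
negative doc id during my crawl.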
Thanks,
Manoj.