Greetings,
Here is the hadoop.log output from my crash - any ideas?
2007-07-31 19:06:50,702 INFO indexer.IndexingFilters - Adding org.apache.nutch.indexer.basic.BasicIndexingFilter
2007-07-31 19:06:50,799 INFO indexer.Indexer - Optimizing index.
2007-07-31 19:06:51,497 INFO indexer.Indexer - Indexer: done
2007-07-31 19:06:51,498 INFO indexer.DeleteDuplicates - Dedup: starting
2007-07-31 19:06:51,510 INFO indexer.DeleteDuplicates - Dedup: adding indexes in: /var/webindex/data/indexes
2007-07-31 19:06:51,733 WARN mapred.LocalJobRunner - job_2xsg2o
java.lang.ArrayIndexOutOfBoundsException: -1
    at org.apache.lucene.index.MultiReader.isDeleted(MultiReader.java:113)
    at org.apache.nutch.indexer.DeleteDuplicates$InputFormat$DDRecordReader.next(DeleteDuplicates.java:176)
    at org.apache.hadoop.mapred.MapTask$1.next(MapTask.java:157)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:46)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:175)
    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:126)
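
In case it helps narrow this down: the -1 means the dedup record reader is handing MultiReader.isDeleted a bad document id, and my guess is that could happen if one of the part indexes under /var/webindex/data/indexes is empty or truncated. Here is a rough sketch I could run to sanity-check each part index before dedup - just my assumption about the cause, and CheckIndexes is a name I made up; it only uses the stock Lucene IndexReader API that ships with Nutch:

import java.io.File;
import org.apache.lucene.index.IndexReader;

// Opens each part index that dedup would read and prints its document
// counts. A part with maxDoc == 0 (or one that fails to open at all)
// would be my first suspect for the -1 doc id above.
public class CheckIndexes {
    public static void main(String[] args) throws Exception {
        File indexesDir = new File("/var/webindex/data/indexes");
        for (File part : indexesDir.listFiles()) {
            if (!part.isDirectory()) continue;
            IndexReader reader = IndexReader.open(part);
            System.out.println(part.getName()
                + ": maxDoc=" + reader.maxDoc()
                + ", numDocs=" + reader.numDocs());
            reader.close();
        }
    }
}

Compiling and running it with the Lucene jar from the Nutch lib directory on the classpath should be enough to see whether any part looks off.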
On Jul 30, 2007, at 2:02 PM, DES wrote:
Look in logs/hadoop.log for the actual reason for this exception. The
console message is not really helpful.
On 7/30/07, Micah Vivion <[EMAIL PROTECTED]> wrote:
Exception in thread "main" java.io.IOException: Job failed!
    at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:604)
    at org.apache.nutch.indexer.DeleteDuplicates.dedup(DeleteDuplicates.java:439)
    at org.apache.nutch.crawl.Crawl.main(Crawl.java:135)