Hi Ismael,

Thanks a lot for the response. I did not build Nutch from sources; I simply copied the nutch-0.9 release. Would you recommend building from the nightly Nutch sources rather than using the nutch-0.9 release?
Thanks,

Manoj.

On Jan 13, 2008 4:43 AM, Ismael <[EMAIL PROTECTED]> wrote:
> Hello. I apparently had a similar problem when trying to dedup; I
> solved it by updating Nutch with the following patch:
>
> http://www.mail-archive.com/[EMAIL PROTECTED]/msg06705.html
>
> I hope this helps you. Good luck!
>
> 2008/1/13, Manoj Bist <[EMAIL PROTECTED]>:
> > Hi,
> >
> > I am getting the following exception when I do a crawl using Nutch, and I
> > am stuck because of it. I would really appreciate any pointers toward
> > resolving it. I found a related mail thread here
> > <http://www.mail-archive.com/[email protected]/msg07745.htm>,
> > but it doesn't describe a solution to the problem.
> >
> > Exception in thread "main" java.io.IOException: Job failed!
> >         at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:604)
> >         at org.apache.nutch.indexer.DeleteDuplicates.dedup(DeleteDuplicates.java:439)
> >         at org.apache.nutch.crawl.Crawl.main(Crawl.java:135)
> >
> > I looked at hadoop.log and it contains the following stack trace:
> >
> > mapred.TaskTracker - Error running child
> > java.lang.ArrayIndexOutOfBoundsException: -1
> >         at org.apache.lucene.index.MultiReader.isDeleted(MultiReader.java:113)
> >         at org.apache.nutch.indexer.DeleteDuplicates$InputFormat$DDRecordReader.next(DeleteDuplicates.java:176)
> >         at org.apache.hadoop.mapred.MapTask$1.next(MapTask.java:157)
> >         at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:46)
> >         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:175)
> >         at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:1445)
> >
> > Thanks,
> >
> > Manoj.
> >
> > --
> > Tired of reading blogs? Listen to your favorite blogs at
> > http://www.blogbard.com !!!!
