I have a similar problem with nightly build #741 (Mar 3, 2009 4:01:53 AM).
What's wrong?
Log from Hadoop:
2009-03-04 14:30:31,531 WARN mapred.LocalJobRunner - job_local_0001
java.lang.IllegalArgumentException: it doesn't make sense to have a field that is neither indexed nor stored
at org.apache.lucene.document.Field.<init>(Field.java:279)
at org.apache.nutch.indexer.lucene.LuceneWriter.createLuceneDoc(LuceneWriter.java:133)
at org.apache.nutch.indexer.lucene.LuceneWriter.write(LuceneWriter.java:239)
at org.apache.nutch.indexer.IndexerOutputFormat$1.write(IndexerOutputFormat.java:50)
at org.apache.nutch.indexer.IndexerOutputFormat$1.write(IndexerOutputFormat.java:40)
at org.apache.hadoop.mapred.ReduceTask$3.collect(ReduceTask.java:410)
at org.apache.nutch.indexer.IndexerMapReduce.reduce(IndexerMapReduce.java:158)
at org.apache.nutch.indexer.IndexerMapReduce.reduce(IndexerMapReduce.java:50)
at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:436)
at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:170)
2009-03-04 14:30:31,668 FATAL indexer.Indexer - Indexer:
java.io.IOException: Job failed!
at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1232)
at org.apache.nutch.indexer.Indexer.index(Indexer.java:72)
at org.apache.nutch.indexer.Indexer.run(Indexer.java:92)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
at org.apache.nutch.indexer.Indexer.main(Indexer.java:101)
$ bin/nutch index crawl/indexes crawl/crawldb crawl/linkdb crawl/segments/*
Indexer: starting
Indexer: java.io.IOException: Job failed!
at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1232)
at org.apache.nutch.indexer.Indexer.index(Indexer.java:72)
at org.apache.nutch.indexer.Indexer.run(Indexer.java:92)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
at org.apache.nutch.indexer.Indexer.main(Indexer.java:101)
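For what it's worth, the hadoop log above pinpoints the real failure: Lucene's
Field constructor (Field.java:279) rejects any field that is neither stored
nor indexed. Here is a minimal sketch that reproduces the same exception
(illustrative only; Lucene 2.x API; the class name FieldRepro is made up):

    import org.apache.lucene.document.Field;

    public class FieldRepro {
        public static void main(String[] args) {
            // Field's constructor validates its Store/Index arguments and
            // throws IllegalArgumentException("it doesn't make sense to have
            // a field that is neither indexed nor stored") on this call:
            new Field("content", "some text", Field.Store.NO, Field.Index.NO);
        }
    }

So somewhere in LuceneWriter.createLuceneDoc a field ends up with both
Field.Store.NO and Field.Index.NO, presumably because an indexing filter
emits a field for which the writer resolves both settings to NO.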
yanky young wrote:
>
> You can look at your Hadoop log file; it has more details about the
> exception.
>
>
> good luck
>
> yanky
>
>
> 2009/3/2 Tony Wang <[email protected]>
>
>> I just installed the nightly build (March 1, 2009) on my dedicated server
>> and tried to crawl a single site, but it throws the exception below:
>>
>> Exception in thread "main" java.io.IOException: Job failed!
>> at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1232)
>> at org.apache.nutch.indexer.Indexer.index(Indexer.java:72)
>> at org.apache.nutch.crawl.Crawl.main(Crawl.java:146)
>>
>> I wonder what the cause is. Has anyone seen this error before? Thanks.
>>
>> Tony
>>
>> --
>> Are you RCholic? www.RCholic.com
>> 温 良 恭 俭 让 仁 义 礼 智 信 (gentleness, kindness, courtesy, temperance,
>> deference; benevolence, righteousness, propriety, wisdom, trustworthiness)
>>
>
>
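That advice is exactly what surfaced the log at the top of this message; in a
default local setup the file is usually logs/hadoop.log under the Nutch
directory. Until the offending field is tracked down, a defensive guard
around the Field construction would turn the hard job failure into a skipped
field. This is a sketch only, assuming the Lucene 2.x Document/Field API;
SafeFieldAdder is a hypothetical helper, not actual Nutch code:

    import org.apache.lucene.document.Document;
    import org.apache.lucene.document.Field;

    // Hypothetical helper, not part of Nutch: adds a field only when the
    // store/index combination is one Lucene will accept.
    public class SafeFieldAdder {
        public static void add(Document doc, String name, String value,
                               Field.Store store, Field.Index index) {
            if (store == Field.Store.NO && index == Field.Index.NO) {
                // This is the combination Field.<init> rejects with the
                // IllegalArgumentException quoted above; skip it instead.
                System.err.println("Skipping field '" + name
                        + "': neither stored nor indexed");
                return;
            }
            doc.add(new Field(name, value, store, index));
        }
    }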