Dedup tries to acquire a lock on the file system from Lucene's IndexReader.java.
Recent versions of Hadoop no longer support this, and I think that is where the
error comes from. What you can try is to comment out the lock-acquisition code in
Lucene's IndexReader, recompile Lucene, and replace the old Lucene jar with the
new one. It might work (it worked for me), but it can become a problem when you
run concurrent jobs.
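
If you would rather not rebuild Lucene right away, another thing that may be worth
testing first is disabling Lucene's lock files globally before the index is opened.
Below is only a rough sketch: it assumes the Lucene version bundled with
Nutch/NutchWax (the 1.9/2.x line) still exposes the deprecated static
FSDirectory.setDisableLocks() switch, and I have not verified whether that flag is
honored on the code path DeleteDuplicates uses over HDFS. The class name is mine.

    import java.io.IOException;

    import org.apache.lucene.index.IndexReader;
    import org.apache.lucene.store.FSDirectory;

    // Hypothetical standalone test: open an index with Lucene's lock files
    // disabled, so IndexReader never tries to acquire a lock on the file system.
    public class OpenIndexWithoutLocks {

      public static void main(String[] args) throws IOException {
        // Assumption: the bundled Lucene still has this deprecated static
        // switch; it turns off lock files for all FSDirectory-based access.
        FSDirectory.setDisableLocks(true);

        // args[0] = path to a local copy of one of the segment indexes
        IndexReader reader = IndexReader.open(args[0]);
        try {
          System.out.println("docs in index: " + reader.numDocs());
        } finally {
          reader.close();
        }
      }
    }

If disabling the locks this way has no effect on the dedup job, then patching
IndexReader and swapping the jar, as described above, is still the way to go.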
- Prasad.
On Friday 23 November 2007 03:10, jibjoice wrote:
> when I use this command "${HADOOP_HOME}/bin/hadoop jar
> ${NUTCHWAX_HOME}/nutchwax.jar all /tmp/inputs /tmp/outputs test"
> I get this error:
>
> - LinkDb: done
> - indexing [Lorg.apache.hadoop.fs.Path;@66e64686
> - Indexer: starting
> - Indexer: linkdb: outputs/linkdb
> - parsing file:/nutch/search/conf/hadoop-default.xml
> - parsing file:/nutch/search/conf/nutch-default.xml
> - parsing file:/tmp/hadoop-unjar50228/wax-default.xml
> - parsing file:/nutch/search/conf/mapred-default.xml
> - parsing file:/nutch/search/conf/mapred-default.xml
> - parsing file:/nutch/search/conf/mapred-default.xml
> - adding segment: /user/nutch/outputs/segments/25501123143257-test
> - Running job: job_0004
> - map 0% reduce 0%
> - map 5% reduce 0%
> - map 15% reduce 0%
> - map 27% reduce 0%
> - map 37% reduce 0%
> - map 47% reduce 0%
> - map 57% reduce 0%
> - map 72% reduce 0%
> - map 82% reduce 0%
> - map 94% reduce 0%
> - map 97% reduce 0%
> - map 100% reduce 0%
> - map 100% reduce 37%
> - map 100% reduce 79%
> - map 100% reduce 87%
> - map 100% reduce 100%
> - Job complete: job_0004
> - Indexer: done
> - dedup outputs/index
> - Dedup: starting
> - parsing file:/nutch/search/conf/hadoop-default.xml
> - parsing file:/nutch/search/conf/nutch-default.xml
> - parsing file:/tmp/hadoop-unjar50228/wax-default.xml
> - parsing file:/nutch/search/conf/mapred-default.xml
> - parsing file:/nutch/search/conf/mapred-default.xml
> - parsing file:/nutch/search/conf/mapred-default.xml
> - Dedup: adding indexes in: outputs/indexes
> - Running job: job_0005
> - map 0% reduce 0%
> - map 100% reduce 100%
> Exception in thread "main" java.io.IOException: Job failed!
>         at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:399)
>         at org.apache.nutch.indexer.DeleteDuplicates.dedup(DeleteDuplicates.java:433)
>         at org.archive.access.nutch.Nutchwax.doDedup(Nutchwax.java:257)
>         at org.archive.access.nutch.Nutchwax.doAll(Nutchwax.java:156)
>         at org.archive.access.nutch.Nutchwax.doJob(Nutchwax.java:389)
>         at org.archive.access.nutch.Nutchwax.main(Nutchwax.java:674)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>         at java.lang.reflect.Method.invoke(Method.java:597)
>         at org.apache.hadoop.util.RunJar.main(RunJar.java:149)
>
> why does this happen, and how do I solve it?