Dedup tries to acquire a lock on the file system from Lucene's IndexReader.java.
Recent versions of Hadoop no longer support this, and I think that is where the
error comes from. What you can try is to comment out the lock-acquisition code in
Lucene's IndexReader, recompile Lucene, and replace the old Lucene jar with the
new one. It might work (it worked for me), but it can become a problem when you
run concurrent jobs.
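
If you would rather not rebuild Lucene right away, another thing that may be worth
testing first is disabling Lucene's lock files globally before the index is opened.
Below is only a rough sketch: it assumes the Lucene version bundled with
Nutch/NutchWax (the 1.9/2.x line) still exposes the deprecated static
FSDirectory.setDisableLocks() switch, and I have not verified whether that flag is
honored on the code path DeleteDuplicates uses over HDFS. The class name is mine.

    import java.io.IOException;

    import org.apache.lucene.index.IndexReader;
    import org.apache.lucene.store.FSDirectory;

    // Hypothetical standalone test: open an index with Lucene's lock files
    // disabled, so IndexReader never tries to acquire a lock on the file system.
    public class OpenIndexWithoutLocks {

      public static void main(String[] args) throws IOException {
        // Assumption: the bundled Lucene still has this deprecated static
        // switch; it turns off lock files for all FSDirectory-based access.
        FSDirectory.setDisableLocks(true);

        // args[0] = path to a local copy of one of the segment indexes
        IndexReader reader = IndexReader.open(args[0]);
        try {
          System.out.println("docs in index: " + reader.numDocs());
        } finally {
          reader.close();
        }
      }
    }

If disabling the locks this way has no effect on the dedup job, then patching
IndexReader and swapping the jar, as described above, is still the way to go.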
- Prasad.
On Friday 23 November 2007 03:10, jibjoice wrote:
> when I use this command "${HADOOP_HOME}/bin/hadoop jar
> ${NUTCHWAX_HOME}/nutchwax.jar all /tmp/inputs /tmp/outputs test"
> I get this error:
>
> - LinkDb: done
> - indexing [Lorg.apache.hadoop.fs.Path;@66e64686
> - Indexer: starting
> - Indexer: linkdb: outputs/linkdb
> - parsing file:/nutch/search/conf/hadoop-default.xml
> - parsing file:/nutch/search/conf/nutch-default.xml
> - parsing file:/tmp/hadoop-unjar50228/wax-default.xml
> - parsing file:/nutch/search/conf/mapred-default.xml
> - parsing file:/nutch/search/conf/mapred-default.xml
> - parsing file:/nutch/search/conf/mapred-default.xml
> - adding segment: /user/nutch/outputs/segments/25501123143257-test
> - Running job: job_0004
> - map 0% reduce 0%
> - map 5% reduce 0%
> - map 15% reduce 0%
> - map 27% reduce 0%
> - map 37% reduce 0%
> - map 47% reduce 0%
> - map 57% reduce 0%
> - map 72% reduce 0%
> - map 82% reduce 0%
> - map 94% reduce 0%
> - map 97% reduce 0%
> - map 100% reduce 0%
> - map 100% reduce 37%
> - map 100% reduce 79%
> - map 100% reduce 87%
> - map 100% reduce 100%
> - Job complete: job_0004
> - Indexer: done
> - dedup outputs/index
> - Dedup: starting
> - parsing file:/nutch/search/conf/hadoop-default.xml
> - parsing file:/nutch/search/conf/nutch-default.xml
> - parsing file:/tmp/hadoop-unjar50228/wax-default.xml
> - parsing file:/nutch/search/conf/mapred-default.xml
> - parsing file:/nutch/search/conf/mapred-default.xml
> - parsing file:/nutch/search/conf/mapred-default.xml
> - Dedup: adding indexes in: outputs/indexes
> - Running job: job_0005
> - map 0% reduce 0%
> - map 100% reduce 100%
> Exception in thread "main" java.io.IOException: Job failed!
>         at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:399)
>         at org.apache.nutch.indexer.DeleteDuplicates.dedup(DeleteDuplicates.java:433)
>         at org.archive.access.nutch.Nutchwax.doDedup(Nutchwax.java:257)
>         at org.archive.access.nutch.Nutchwax.doAll(Nutchwax.java:156)
>         at org.archive.access.nutch.Nutchwax.doJob(Nutchwax.java:389)
>         at org.archive.access.nutch.Nutchwax.main(Nutchwax.java:674)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>         at java.lang.reflect.Method.invoke(Method.java:597)
>         at org.apache.hadoop.util.RunJar.main(RunJar.java:149)
>
> why does this happen, and how do I solve it?