You should use Hadoop 0.12.3, for example, to dedup. The current version
0.14.x doesn't support the lock operation.

2007/10/18, Matei Zaharia <[EMAIL PROTECTED]>:
>
> Hi,
>
> I'm sometimes getting the following error in the dedup 3 job when
> running Nutch 0.9 on top of Hadoop 0.14.2:
>
> java.io.IOException: Lock obtain timed out: [EMAIL PROTECTED]://r37:54310/user/matei/crawl4/indexes/part-00000/write.lock
>         at org.apache.lucene.store.Lock.obtain(Lock.java:69)
>         at org.apache.lucene.index.IndexReader.aquireWriteLock(IndexReader.java:526)
>         at org.apache.lucene.index.IndexReader.deleteDocument(IndexReader.java:551)
>         at org.apache.nutch.indexer.DeleteDuplicates.reduce(DeleteDuplicates.java:378)
>         at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:322)
>         at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:1782)
>
> Other times, it works just fine. Do you know why this is happening?
>
> Thanks,
>
> Matei Zaharia
>