Hi,

I'm running Nutch with a Hadoop nightly build and everything works fine
except the dedup job. I'm getting "Lock obtain timed out" every time in
DeleteDuplicates.reduce(), after the call to reader.deleteDocument(value.get()).
I have 4 servers running the job in parallel through Hadoop.
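
For context, the failing pattern in the reducer boils down to something
like this (a minimal sketch from memory, not the exact Nutch source; the
reader is the IndexReader opened on the part index):

    import java.io.IOException;
    import org.apache.lucene.index.IndexReader;

    public class DedupDeleteSketch {
      // Delete the documents collected for one key. The first
      // deleteDocument() call makes Lucene acquire write.lock in the
      // index directory; if another task (or a stale lock file left on
      // DFS) already holds it, Lock.obtain() times out with the
      // exception below.
      static void deleteDocs(IndexReader reader, int[] docIds)
          throws IOException {
        for (int id : docIds) {
          reader.deleteDocument(id);
        }
      }
    }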

Here is the error message:

java.io.IOException: Lock obtain timed out: [EMAIL PROTECTED]://xxx.xxx.xxx.xxx:9000/user/nutch/crawl/indexes/part-00020/write.lock
        at org.apache.lucene.store.Lock.obtain(Lock.java:69)
        at org.apache.lucene.index.IndexReader.aquireWriteLock(IndexReader.java:526)
        at org.apache.lucene.index.IndexReader.deleteDocument(IndexReader.java:551)
        at org.apache.nutch.indexer.DeleteDuplicates.reduce(DeleteDuplicates.java:451)
        at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:323)
        at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:1763)


What can I do to avoid this problem?

Thanks,

des