This is the error I keep getting whenever I try to fetch more than 400K files at a time on a 4-node Hadoop cluster running Nutch 1.0:

org.apache.hadoop.ipc.RemoteException: org.apache.hadoop.hdfs.protocol.AlreadyBeingCreatedException: failed to create file /user/hadoop/crawl/segments/20091013161641/crawl_fetch/part-00015/index for DFSClient_attempt_200910131302_0011_r_000015_2 on client 192.168.1.201 because current leaseholder is trying to recreate file.

Can anybody shed some light on this issue? I was under the impression that 400K was small potatoes for a Nutch/Hadoop combo.
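
One thing I noticed: the failing task ID ends in _2, so this is a retried (or speculative) reduce attempt, and the message says the current leaseholder is trying to recreate the file, which I read as an earlier attempt for the same reducer possibly still holding the HDFS lease on that part file. Would turning off speculative execution for reducers be a reasonable workaround? I mean something roughly like the sketch below (the class name is just for illustration; the property name is the one from the Hadoop 0.19/0.20 line that Nutch 1.0 runs on), or the equivalent <property> entry in conf/hadoop-site.xml:

  import org.apache.hadoop.mapred.JobConf;
  import org.apache.nutch.util.NutchConfiguration;

  public class DisableSpeculativeReduce {
      public static void main(String[] args) {
          // Build a job conf the way Nutch does, then turn off speculative
          // execution for reducers so only one attempt writes each part file.
          JobConf job = new JobConf(NutchConfiguration.create());
          job.setBoolean("mapred.reduce.tasks.speculative.execution", false);

          // Sanity check: should print "false".
          System.out.println(job.getBoolean(
              "mapred.reduce.tasks.speculative.execution", true));
      }
  }

I haven't tried this yet, so I'd appreciate confirmation that I'm not chasing the wrong cause.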

Thanks,


Eric Osgood
---------------------------------------------
Cal Poly - Computer Engineering, Moon Valley Software
---------------------------------------------
eosg...@calpoly.edu, e...@lakemeadonline.com
---------------------------------------------
www.calpoly.edu/~eosgood, www.lakemeadonline.com
