Thanks for this tip, I have now adapted my hadoop-site.xml to use a big disk
for temporary storage.

Regards


Andrzej Bialecki <[EMAIL PROTECTED]> wrote:

ML mail wrote:
> Thanks for your answer! So I will move on and use the latest nightly build
> instead of the 0.9 stable version. Hopefully the nightly build is stable
> enough to use in a production environment.
> 
> 
> Lyndon Maydwell wrote:
>
> From what I have read, this has been solved in recent revisions, so
> downloading a new build or checking out the latest source should solve
> the problem. I am still using a version that has this problem, but
> should be switching shortly. My workaround in the meantime has been to
> delete the temporary files after crawling. This works for me, and I
> suspect the problem is due to Nutch failing to delete these files itself.
> 
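(For reference, that cleanup workaround might look roughly like the sketch
below. The crawl arguments are only illustrative, and the path assumes
Hadoop's default hadoop.tmp.dir of /tmp/hadoop-${user.name}; adjust it if
yours differs.)

   # Run a crawl, then clear Hadoop's leftover temporary files.
   # Assumes the default hadoop.tmp.dir (/tmp/hadoop-<user>) and a
   # standard Nutch 0.x layout; both are assumptions, not givens.
   bin/nutch crawl urls -dir crawl -depth 3
   rm -rf /tmp/hadoop-"$USER"/*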

In fact, I doubt this would solve your problem. The latest trunk doesn't 
change the temporary space usage in any significant way, so if you ran out 
of space before, you will run out again with the latest nightly build.

The solution is to configure Hadoop to use a location other than /tmp 
for temporary files, one where you have enough disk space to hold all 
downloaded and temporary data. You can configure this by adding the 
following to conf/hadoop-site.xml:



<property>
  <name>hadoop.tmp.dir</name>
  <value>/my/large/disk/space/hadoop-${user.name}</value>
</property>



(if you run Hadoop in non-local mode, you need to restart the cluster).
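(A restart with the stock scripts might look like the sketch below; the 
script locations assume a standard Hadoop 0.x install, so treat them as an 
assumption.)

   # Restart the cluster so the new hadoop.tmp.dir takes effect.
   # Assumes the standard start/stop scripts under $HADOOP_HOME/bin.
   bin/stop-all.sh
   bin/start-all.sh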

-- 
Best regards,
Andrzej Bialecki     <><
  ___. ___ ___ ___ _ _   __________________________________
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com



