No need to open an issue...
It seems that the svn version has already addressed this problem! Thanks,
Nutch developers!

Cheers,
Enrico

On 4/10/06, Enrico Triolo <[EMAIL PROTECTED]> wrote:
> I definitely agree with you; I have hundreds of MB in my /tmp/hadoop dir.
> I think we should open an issue in Jira; if you want, I could open one.
>
> Enrico
>
> On 4/3/06, Raghavendra Prabhu <[EMAIL PROTECTED]> wrote:
> > Hi
> >
> > I have been raising this point for quite a time
> >
> > Right now, when we have a new job, we store the job.jar and job.xml files
> > in the job tracker. If I am right, the task tracker uses these job.jar and
> > job.xml files as well.
> >
> > Shouldn't we clean up after the job has completed (that is, purge these
> > files)? The purging can be done immediately when the job completes.
> >
> > I find that these files consume a lot of disk space and have to be
> > deleted.
> >
> > Has anyone else noticed the same problem? At the end of a single crawl I
> > find that this has taken up at least 30 MB of disk space.
> >
> > There are two alternatives:
> > 1) delete the temporary files with a shell script
> > 2) clean them up in code as soon as the job completes
> >
> > The problem with option 1 is that other instances of Nutch may be running,
> > in which case the shell script must not delete files that another instance
> > is still using. So it is better to stick with option 2 (a rough sketch of
> > option 2 follows the quoted thread below).
> >
> > Rgds
> > Prabhu
> >
> >
>
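For reference, option 2 could look roughly like the sketch below: a cleanup
step invoked once the job tracker marks a job complete. This is only an
illustration; the class name, the cleanupJob entry point, and the per-job
directory layout under /tmp/hadoop are assumptions made for the example, not
the actual Nutch/Hadoop API.

    import java.io.File;

    // Illustrative sketch of option 2; names and directory layout are
    // assumptions, not the real Nutch/Hadoop API.
    public class JobFileCleaner {

        // Assumed local root, matching the /tmp/hadoop dir mentioned above.
        private static final File LOCAL_JOB_ROOT = new File("/tmp/hadoop");

        // Purge job.jar, job.xml and the rest of the per-job directory
        // once the job has completed.
        public static void cleanupJob(String jobId) {
            deleteRecursively(new File(LOCAL_JOB_ROOT, jobId));
        }

        private static void deleteRecursively(File f) {
            File[] children = f.listFiles();  // null for plain files
            if (children != null) {
                for (File child : children) {
                    deleteRecursively(child);
                }
            }
            f.delete();
        }
    }

Because this runs inside the job tracker itself and only touches the
directory of the job that just finished, it avoids the race with other
running instances that makes the external shell script (option 1) unsafe.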

