Alright, then you can safely remove directories older than 24 hours using the 
find command, assuming no job runs for that ridiculous amount of time :) 
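A minimal sketch of such a cleanup, assuming the jobcache path is passed in explicitly (the actual location depends on your mapred.local.dir setting):

```shell
# Sketch: remove jobcache subdirectories untouched for more than 24 hours.
# The helper name and the caller-supplied path are illustrative assumptions.
clean_jobcache() {
  # -mindepth/-maxdepth 1 target only the immediate job directories;
  # -mmin +1440 matches entries last modified more than 1440 min (24 h) ago.
  find "$1" -mindepth 1 -maxdepth 1 -type d -mmin +1440 -exec rm -rf {} +
}
```

For example, `clean_jobcache /tmp/hadoop/mapred/local/taskTracker/jobcache` (path hypothetical) would prune stale job directories while leaving anything touched within the last day alone.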
 
-----Original message-----
> From: yann <[email protected]>
> Sent: Friday 27th December 2013 18:24
> To: [email protected]
> Subject: RE: Too many links in hadoop directory
> 
> Hi Markus,
> 
> thanks for your answer - however, this isn't a good option for me, as I'm
> running a Nutch server with multiple instances crawling multiple sites. 
> 
> From the Nutch API, I can't tell which folders under the "jobcache"
> directory belong to a crawl that has just completed versus those that
> belong to other, still-ongoing crawls.
> 
> Or can I?
> 
> Thanks
> 
> Yann
> 
> 
> 
> --
> View this message in context: 
> http://lucene.472066.n3.nabble.com/Too-many-links-in-hadoop-directory-tp4108378p4108393.html
> Sent from the Nutch - User mailing list archive at Nabble.com.
> 
