Forgot the link.

github.com/edwardcapriolo/filecrush
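
For context, the Clean utility described below deletes files in a directory that are older than a given age. A minimal local-filesystem analogue as a shell sketch (the real tool runs against HDFS, and the function name here is illustrative, not filecrush's actual interface):

```shell
# Local-filesystem analogue of filecrush's Clean utility: delete regular
# files in a directory that are older than a given number of days.
# (The real tool operates on HDFS; this function name is illustrative.)
clean_old_files() {
  dir="$1"
  days="$2"
  # -mtime +N matches files last modified more than N*24 hours ago.
  find "$dir" -maxdepth 1 -type f -mtime +"$days" -delete
}
```

A cron entry invoking something like this against the tmp directories is one way to keep them from growing without bound.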

On 6/1/12, Edward Capriolo <edlinuxg...@gmail.com> wrote:
> The filecrush tool has a small utility called Clean that accepts an
> age argument and deletes all the files in a directory older than a
> certain time.
>
> We use Clean to clean up the tmp HDFS directories that applications
> leave remnants in.
>
> Edward
>
> On 6/1/12, Vinod Singh <vi...@vinodsingh.com> wrote:
>> Yes, that is how I do it. Though 1 month is too long; I keep it at just 2 days.
>>
>> Thanks,
>> Vinod
>>
>> http://blog.vinodsingh.com/
>>
>> On Fri, Jun 1, 2012 at 2:15 PM, Ruben de Vries
>> <ruben.devr...@hyves.nl>wrote:
>>
>>> So I should write a job that cleans up 1-month-old results or something
>>> like that?
>>>
>>> From: Vinod Singh [mailto:vi...@vinodsingh.com]
>>> Sent: Friday, June 01, 2012 10:35 AM
>>> To: user@hive.apache.org
>>> Subject: Re: Hive scratch dir not cleaning up
>>>
>>> Hive deletes job contents from the scratch directory on completion of
>>> the job. However, failed / killed jobs leave data behind, which needs
>>> to be removed manually.
>>>
>>> Thanks,
>>> Vinod
>>>
>>> http://blog.vinodsingh.com/
>>> On Fri, Jun 1, 2012 at 1:58 PM, Ruben de Vries <ruben.devr...@hyves.nl>
>>> wrote:
>>> Hey Hivers,
>>>
>>> I’m almost ready to replace our old Hadoop implementation with an
>>> implementation using Hive.
>>>
>>> Now I’ve run into (hopefully) my last problem: my /tmp/hive-hduser dir
>>> is getting kinda big!
>>> It doesn’t seem to clean up these tmp files. Googling for it, I ran
>>> into some tickets about a cleanup setting; should I enable it with the
>>> setting below?
>>> Why doesn’t it do that by default? Am I the only one somehow racking up
>>> a lot of space with tmp files?
>>>
>>>
>>>
>>>
>>> <property>
>>>   <name>hive.start.cleanup.scratchdir</name>
>>>   <value>true</value>
>>> </property>
>>>
>>>
>>
>
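
For anyone landing on this thread: the manual removal of scratch data mentioned above might look like the following sketch. The path is the default scratch dir from the question, and the flags are assumptions for a 2012-era cluster (older releases spell `-rm -r` as `-rmr`):

```shell
# Manual sweep of Hive's scratch dir, for data left by failed/killed jobs.
# Path and flags are assumptions; adjust for your cluster and Hadoop
# version. -skipTrash avoids parking large tmp data in the HDFS trash.
clean_scratch_dir() {
  scratch="${1:-/tmp/hive-hduser}"
  hadoop fs -rm -r -skipTrash "$scratch"
}
```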
