I think this is related to HADOOP-1558: https://issues.apache.org/jira/browse/HADOOP-1558
Per-job cleanups that are not run client-side must be run in a separate JVM, since, as a rule, we don't run user code in long-lived daemons.
Doug

Stu Hood wrote:
Does anyone have any ideas on this issue? Otherwise, if I were to write a patch to add this option for jobs to Hadoop, would it be useful for anyone else?

Thanks,
Stu

-----Original Message-----
From: Stu Hood <[EMAIL PROTECTED]>
Sent: Fri, August 24, 2007 9:43 am
To: [email protected]
Subject: Removing files after processing

Hello,

What's the best way to go about doing cleanup after MapReduce jobs? I'd like to have the job delete its input files when it has finished successfully (but preferably before it is marked as having finished, so I don't have to deal with a race condition). Obviously, I don't want to have to track which files are being processed by each job, since that data is stored anyway. Also, I'm using JobClient.submitJob(), so I can't sit around and wait to do the cleanup manually.

Any suggestions? Thanks!

Stu Hood
Webmail.us
"You manage your business. We'll manage your email."®
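[Editor's note: until a framework-side hook along the lines of HADOOP-1558 exists, the usual workaround is to block in the submitting process, which is exactly what Stu was trying to avoid with submitJob(). For readers who can keep the client JVM alive, a minimal sketch against the classic org.apache.hadoop.mapred API follows; the SubmitAndCleanup class name and 5-second polling interval are illustrative, and exact method names (e.g. FileInputFormat.getInputPaths, the two-argument FileSystem.delete) vary slightly across Hadoop versions.]

import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.RunningJob;

public class SubmitAndCleanup {
  public static void main(String[] args) throws Exception {
    JobConf conf = new JobConf(SubmitAndCleanup.class);
    // ... set mapper/reducer and input/output paths as usual ...

    // submitJob() returns immediately with a handle to the running job.
    RunningJob job = new JobClient(conf).submitJob(conf);

    // Poll until the job finishes; the interval is arbitrary.
    while (!job.isComplete()) {
      Thread.sleep(5000);
    }

    // Only delete the inputs once the job is known to have succeeded,
    // so a failed run can be retried against the original data.
    if (job.isSuccessful()) {
      FileSystem fs = FileSystem.get(conf);
      for (Path input : FileInputFormat.getInputPaths(conf)) {
        fs.delete(input, true); // true = recursive delete
      }
    }
  }
}

Note that this leaves the race Stu mentions, just reversed: the job is marked complete before its inputs disappear, so anything keyed off job completion may briefly see stale input files. Deleting inside the framework, per HADOOP-1558, would close that window.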
