JobInProgress.garbageCollect deletes <job-dir> twice.
-----------------------------------------------------
Key: HADOOP-3214
URL: https://issues.apache.org/jira/browse/HADOOP-3214
Project: Hadoop Core
Issue Type: Bug
Reporter: Tsz Wo (Nicholas), SZE
In JobInProgress.garbageCollect, the following code deletes <job-dir> twice.
{code}
// JobClient always creates a new directory with job files
// so we remove that directory to cleanup
FileSystem fs = FileSystem.get(conf);
fs.delete(new Path(profile.getJobFile()).getParent(), true);
// Delete temp dfs dirs created if any, like in case of
// speculative exn of reduces.
Path tempDir = new Path(conf.getSystemDir(), jobId);
fs.delete(tempDir, true);
{code}
Below is the clean-up trace copied from HADOOP-3182:
* FileSystem.delete <job-dir> by JobTracker as user_account
at org.apache.hadoop.mapred.JobInProgress.garbageCollect(JobInProgress.java:1637)
at org.apache.hadoop.mapred.JobInProgress.isJobComplete(JobInProgress.java:1396)
at org.apache.hadoop.mapred.JobInProgress.completedTask(JobInProgress.java:1357)
at org.apache.hadoop.mapred.JobInProgress.updateTaskStatus(JobInProgress.java:565)
at org.apache.hadoop.mapred.JobTracker$TaskCommitQueue.run(JobTracker.java:2270)
<job-dir> is obtained by *profile.getJobFile().getParent()*
* FileSystem.delete <job-dir> again by JobTracker as user_account
at org.apache.hadoop.mapred.JobInProgress.garbageCollect(JobInProgress.java:1642)
at org.apache.hadoop.mapred.JobInProgress.isJobComplete(JobInProgress.java:1396)
at org.apache.hadoop.mapred.JobInProgress.completedTask(JobInProgress.java:1357)
at org.apache.hadoop.mapred.JobInProgress.updateTaskStatus(JobInProgress.java:565)
at org.apache.hadoop.mapred.JobTracker$TaskCommitQueue.run(JobTracker.java:2270)
<job-dir> is obtained by *new Path(conf.getSystemDir(), jobId)*
Is there any case in which these two paths are distinct?
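If the two paths can in fact differ, one possible way to avoid the redundant call is to collect both candidates into a set and delete each distinct directory once. The following is only a rough sketch of that idea: it uses java.nio.file.Path in place of Hadoop's Path and FileSystem so that it runs standalone, and the class name CleanupPaths, the method distinctCleanupDirs, and the sample paths are all hypothetical, not taken from JobInProgress.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical illustration; not the actual JobInProgress code.
final class CleanupPaths {

    // Returns the distinct directories that clean-up would need to delete.
    // jobFile stands in for profile.getJobFile(), systemDir for
    // conf.getSystemDir(); both are assumptions for this sketch.
    static Set<Path> distinctCleanupDirs(String jobFile, String systemDir, String jobId) {
        Set<Path> dirs = new LinkedHashSet<>();

        // First candidate: the parent directory of the job file.
        Path jobDir = Paths.get(jobFile).getParent();
        if (jobDir != null) {
            dirs.add(jobDir.normalize());
        }

        // Second candidate: <system-dir>/<job-id>.
        // The set silently drops it when it equals the first candidate.
        dirs.add(Paths.get(systemDir, jobId).normalize());
        return dirs;
    }

    public static void main(String[] args) {
        // Sample values are made up for the demonstration.
        Set<Path> dirs = distinctCleanupDirs(
                "/mapred/system/job_0001/job.xml", "/mapred/system", "job_0001");
        // Each entry would be passed to a single recursive delete.
        for (Path dir : dirs) {
            System.out.println("would delete " + dir);
        }
    }
}
```

With this shape, the clean-up issues one delete per distinct directory whether or not the two paths coincide, so the answer to the question above does not affect correctness.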