[ https://issues.apache.org/jira/browse/MAPREDUCE-927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12830889#action_12830889 ]
Amareshwari Sriramadasu commented on MAPREDUCE-927:
---------------------------------------------------

I would propose the following to solve this:
* The task logs directory hierarchy is currently userlogs/attemptid. Instead, it should be userlogs/jobid/attemptid.
* TaskTracker has a thread for deleting task logs. Whenever TaskTracker gets a KillJobAction, it adds an entry (jobid, timestamp) for deleting the task logs, i.e. (jobId, jobCompletionTime + userLogRetainsHours). The userlogs/jobid directory will be deleted at or after that timestamp (jobCompletionTime + userLogRetainsHours).

This means user logs will be kept for userLogRetainsHours after job completion.

We can have userLogRetainsHours as a TaskTracker parameter, since it controls the disk space of the tracker.
Or we can keep it as a job-level parameter, as it is now, since the logs are kept for the user to access.
If it is a job-level parameter, we would need a TaskTracker parameter as an upper bound on the job configuration to control the disk space.

Thoughts?

> Cleanup of task-logs should happen in TaskTracker instead of the Child
> ----------------------------------------------------------------------
>
>                 Key: MAPREDUCE-927
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-927
>             Project: Hadoop Map/Reduce
>          Issue Type: Sub-task
>          Components: tasktracker
>    Affects Versions: 0.21.0
>            Reporter: Vinod K V
>            Priority: Blocker
>             Fix For: 0.21.0
>
>
> Task logs' cleanup is being done in Child now. This is undesirable at least for two reasons: 1) failures while cleaning up will affect the user's tasks, and 2) the task's wall time will get affected due to operations that TT actually should own.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
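
To make the proposal in the comment above concrete, here is a minimal sketch of the bookkeeping it describes: a queue of (jobid, timestamp) entries, with the job-level retain hours capped by a tracker-level upper bound. It is not TaskTracker code. The class UserLogCleanerSketch, its methods markJobLogsForDeletion and pollExpired, and the maxRetainHours field are hypothetical names chosen for illustration, and the current time is passed in explicitly so the logic can be exercised without a real cleanup thread or file system.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.PriorityQueue;

/** Hypothetical sketch of the proposed per-job log retention; not Hadoop API. */
class UserLogCleanerSketch {
    private static final long MS_PER_HOUR = 60L * 60L * 1000L;

    /** One pending deletion: the (jobid, timestamp) entry from the proposal. */
    private static final class Entry implements Comparable<Entry> {
        final String jobId;
        final long deleteAfterMs;

        Entry(String jobId, long deleteAfterMs) {
            this.jobId = jobId;
            this.deleteAfterMs = deleteAfterMs;
        }

        @Override
        public int compareTo(Entry other) {
            // Earliest deletion time first.
            return Long.compare(deleteAfterMs, other.deleteAfterMs);
        }
    }

    private final PriorityQueue<Entry> pending = new PriorityQueue<>();
    private final int maxRetainHours; // assumed tracker-level upper bound

    UserLogCleanerSketch(int maxRetainHours) {
        this.maxRetainHours = maxRetainHours;
    }

    /**
     * Would be called when the tracker learns the job is done (KillJobAction
     * in the proposal). Returns jobCompletionTime + effective retain hours.
     */
    synchronized long markJobLogsForDeletion(String jobId,
                                             long jobCompletionTimeMs,
                                             int requestedRetainHours) {
        // Cap the job-level setting by the tracker-level bound.
        int retainHours = Math.min(Math.max(requestedRetainHours, 0), maxRetainHours);
        long deleteAfterMs = jobCompletionTimeMs + retainHours * MS_PER_HOUR;
        pending.add(new Entry(jobId, deleteAfterMs));
        return deleteAfterMs;
    }

    /**
     * Would be called periodically by the cleanup thread. Returns the
     * userlogs/jobid directories whose deletion time has been reached.
     */
    synchronized List<String> pollExpired(long nowMs) {
        List<String> dirs = new ArrayList<>();
        while (!pending.isEmpty() && pending.peek().deleteAfterMs <= nowMs) {
            dirs.add("userlogs/" + pending.poll().jobId);
        }
        return dirs;
    }
}
```

In this sketch the cleanup thread would call pollExpired with the current time and delete each returned directory recursively. Because the directory is keyed by job ID, one deletion removes the logs of every attempt of that job.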