[ 
https://issues.apache.org/jira/browse/HADOOP-5436?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer resolved HADOOP-5436.
--------------------------------------

    Resolution: Fixed

> job history directory grows without bound, locks up job tracker on new job 
> submission
> -------------------------------------------------------------------------------------
>
>                 Key: HADOOP-5436
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5436
>             Project: Hadoop Common
>          Issue Type: Bug
>    Affects Versions: 0.19.0, 0.20.0, 0.20.1, 0.20.2
>            Reporter: Tim Williamson
>         Attachments: HADOOP-5436.patch
>
>
> An unpleasant surprise upgrading to 0.19: requests to jobtracker.jsp would 
> take a long time or even time out whenever new jobs were submitted.  
> Investigation showed the call to JobInProgress.initTasks() was calling 
> JobHistory.JobInfo.logSubmitted() which in turn was calling 
> JobHistory.getJobHistoryFileName() which was pegging the CPU for a couple of 
> minutes.  Further investigation showed there were 200,000+ files in the job 
> history folder -- and every submission was creating a FileStatus for them 
> all, then applying a regular expression to just the name.  All this just on 
> the off chance the job tracker had been restarted (see HADOOP-3245).  To make 
> matters worse, these files cannot be safely deleted while the job tracker is 
> running, as the disappearance of a history file at the wrong time causes a 
> FileNotFoundException.
> So to summarize the issues:
> - having Hadoop default to storing all the history files in a single 
> directory is a Bad Idea
> - doing expensive processing of every history file on every job submission is 
> a Worse Idea
> - doing expensive processing of every history file on every job submission 
> while holding a lock on the JobInProgress object and thereby blocking the 
> jobtracker.jsp from rendering is a Terrible Idea (note: haven't confirmed 
> this, but a cursory glance suggests that's what's going on)
> - not being able to clean up the mess without taking down the job tracker is 
> just Unfortunate
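
The lookup cost described above can be illustrated with a minimal sketch. It uses plain java.nio rather than Hadoop's FileSystem API, and the class name, method names, file-name layout, and per-job subdirectory layout are all hypothetical; none of it is taken from the Hadoop source or from the attached patch.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Optional;
import java.util.regex.Pattern;

// Hypothetical illustration only; not the Hadoop JobHistory implementation.
final class HistoryLookupSketch {

    // Flat layout: every entry in the directory is visited and its name is
    // matched against a regex, so each lookup costs O(number of history files).
    static Optional<String> findByFullScan(Path historyDir, String jobId) throws IOException {
        // Assumed name layout: <host>_<timestamp>_<jobId>_<user>_<jobName>
        Pattern p = Pattern.compile(".*_" + Pattern.quote(jobId) + "_.*");
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(historyDir)) {
            for (Path entry : entries) {
                String name = entry.getFileName().toString();
                if (p.matcher(name).matches()) {
                    return Optional.of(name);
                }
            }
        }
        return Optional.empty();
    }

    // Assumed alternative layout: one subdirectory per job ID, so a lookup
    // only lists the handful of files that belong to that job.
    static Optional<String> findInJobSubdir(Path historyDir, String jobId) throws IOException {
        Path jobDir = historyDir.resolve(jobId);
        if (!Files.isDirectory(jobDir)) {
            return Optional.empty();
        }
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(jobDir)) {
            for (Path entry : entries) {
                return Optional.of(entry.getFileName().toString());
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("history");
        Files.createFile(dir.resolve("host_1_job_200903061234_0001_tim_wordcount"));
        Path jobDir = Files.createDirectory(dir.resolve("job_200903061234_0002"));
        Files.createFile(jobDir.resolve("host_1_job_200903061234_0002_tim_sort"));

        System.out.println(findByFullScan(dir, "job_200903061234_0001"));
        System.out.println(findInJobSubdir(dir, "job_200903061234_0002"));
    }
}
```

With 200,000+ entries in one directory, the first method repeats the full listing on every call, while the second touches a single small directory. Partitioning by job ID is only one possible layout; bucketing by date would bound the listing in a similar way.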


