[
https://issues.apache.org/jira/browse/HADOOP-5436?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Amar Kamat resolved HADOOP-5436.
--------------------------------
Resolution: Fixed
I think locking the jobtracker cannot be avoided, as it's inline with the heartbeat.
MAPREDUCE-786 should make the JobHistory calls faster.
> job history directory grows without bound, locks up job tracker on new job
> submission
> -------------------------------------------------------------------------------------
>
> Key: HADOOP-5436
> URL: https://issues.apache.org/jira/browse/HADOOP-5436
> Project: Hadoop Common
> Issue Type: Bug
> Affects Versions: 0.19.0
> Reporter: Tim Williamson
> Attachments: HADOOP-5436.patch
>
>
> An unpleasant surprise upgrading to 0.19: requests to jobtracker.jsp would
> take a long time or even time out whenever new jobs were submitted.
> Investigation showed the call to JobInProgress.initTasks() was calling
> JobHistory.JobInfo.logSubmitted() which in turn was calling
> JobHistory.getJobHistoryFileName() which was pegging the CPU for a couple
> minutes. Further investigation showed there were 200,000+ files in the job
> history folder -- and every submission was creating a FileStatus for them
> all, then applying a regular expression to just the name. All this just on
> the off chance the job tracker had been restarted (see HADOOP-3245). To make
> matters worse, these files cannot be safely deleted while the job tracker is
> running, as the disappearance of a history file at the wrong time causes a
> FileNotFoundException.
> So to summarize the issues:
> - having Hadoop default to storing all the history files in a single
> directory is a Bad Idea
> - doing expensive processing of every history file on every job submission is
> a Worse Idea
> - doing expensive processing of every history file on every job submission
> while holding a lock on the JobInProgress object and thereby blocking the
> jobtracker.jsp from rendering is a Terrible Idea (note: haven't confirmed
> this, but a cursory glance suggests that's what's going on)
> - not being able to clean up the mess without taking down the job tracker is
> just Unfortunate
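
For illustration only, here is a minimal sketch of the kind of layout and lookup the first two points argue for. It is not the attached patch and not the MAPREDUCE-786 change. It assumes job IDs of the form job_<jobtracker start timestamp>_<sequence>, a hypothetical per-day subdirectory layout, and a hypothetical history file name of the form <host>_<jobId>_<user>. It also uses java.nio.file on a local path rather than the Hadoop FileSystem API. The class and method names (HistoryLookupSketch, bucketFor, findHistoryFiles) are made up for this example.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

/** Illustrative sketch only; all names and the directory layout are assumptions. */
final class HistoryLookupSketch {

    /**
     * Maps a job ID to a per-day subdirectory of the history root, so that no
     * single directory has to hold every history file ever written.
     * Assumes IDs such as "job_200903101234_0001" (hypothetical layout).
     */
    static Path bucketFor(Path historyRoot, String jobId) {
        String[] parts = jobId.split("_");
        if (parts.length != 3 || parts[1].length() < 8) {
            throw new IllegalArgumentException("unexpected job id: " + jobId);
        }
        // The first 8 digits of the timestamp part are taken as yyyyMMdd.
        return historyRoot.resolve(parts[1].substring(0, 8));
    }

    /**
     * Returns the names of history files for one job. Only the job's own
     * bucket is listed, and entries are matched by name alone, with no
     * per-file status object and no regular expression.
     */
    static List<String> findHistoryFiles(Path historyRoot, String jobId) {
        Path bucket = bucketFor(historyRoot, jobId);
        List<String> matches = new ArrayList<>();
        if (!Files.isDirectory(bucket)) {
            return matches; // nothing recorded for that day
        }
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(bucket)) {
            for (Path entry : entries) {
                String name = entry.getFileName().toString();
                // Hypothetical naming scheme: <host>_<jobId>_<user>
                if (name.contains("_" + jobId + "_")) {
                    matches.add(name);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return matches;
    }
}
```

With a layout like this, the cost of a lookup depends on how many jobs ran on one day rather than on the total number of history files, and old day directories could in principle be archived or removed without touching the one currently in use. Whether that fits the recovery path from HADOOP-3245 is not something this sketch addresses.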