[
https://issues.apache.org/jira/browse/MAPREDUCE-1533?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Vinod K V updated MAPREDUCE-1533:
---------------------------------
Hadoop Flags: [Reviewed]
Fix Version/s: 0.22.0
> Reduce or remove usage of String.format() in
> CapacityTaskScheduler.updateQSIObjects and Counters.makeEscapedString()
> --------------------------------------------------------------------------------------------------------------------------
>
> Key: MAPREDUCE-1533
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1533
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Components: jobtracker
> Affects Versions: 0.20.1
> Reporter: Rajesh Balamohan
> Assignee: Dick King
> Fix For: 0.22.0
>
> Attachments: mapreduce-1533--2010-05-10a.patch,
> mapreduce-1533--2010-05-21.patch, mapreduce-1533--2010-05-21a.patch,
> mapreduce-1533--2010-05-24.patch, MAPREDUCE-1533-and-others-20100413.1.txt,
> MAPREDUCE-1533-and-others-20100413.bugfix.txt, mapreduce-1533-v1.4.patch,
> mapreduce-1533-v1.8.patch
>
>
> When short jobs are executed in Hadoop with OutOfBandHeartBeat=true, the JT
> invokes the heartBeat() method very frequently. Each invocation internally
> calls CapacityTaskScheduler.updateQSIObjects().
> CapacityTaskScheduler.updateQSIObjects() in turn calls String.format()
> to set the job scheduling information. Depending on the size of the
> "jobQueuesManager" and "queueInfoMap" data structures, the number of
> String.format() calls becomes very high. String.format() internally does
> pattern matching, which turns out to be very expensive. (This was revealed
> while profiling the JT: almost 57% of the time was spent in
> CapacityScheduler.assignTasks(), of which String.format() took 46%.)
> Would it be possible to call String.format() only when
> JobInProgress.getSchedulingInfo() is invoked? This might reduce the pressure
> on the JT while processing heartbeats.
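
The deferral suggested above could look roughly like the sketch below. It is
only an illustration of the idea, not the attached patch and not the actual
Hadoop code: the class LazySchedulingInfo, its fields, and the wording of the
formatted string are all hypothetical. The heartbeat path stores raw numbers
and invalidates a cache; the string is built (with StringBuilder rather than
String.format()) only when getSchedulingInfo() is called.

```java
/**
 * Hypothetical stand-in for per-job scheduling info. This is an assumption
 * made for illustration; it is not a class from Hadoop.
 */
final class LazySchedulingInfo {
    // Raw values refreshed on every heartbeat; storing them is cheap.
    private int runningMaps;
    private int runningReduces;
    private int pendingTasks;

    // Cached display string; null means it is stale and must be rebuilt.
    private String cached;

    // Counts how often the string was actually built (illustration only).
    int formatCalls;

    /** Hot path: record the numbers and invalidate the cache, no formatting. */
    synchronized void update(int runningMaps, int runningReduces, int pendingTasks) {
        this.runningMaps = runningMaps;
        this.runningReduces = runningReduces;
        this.pendingTasks = pendingTasks;
        this.cached = null;
    }

    /** Cold path: build the string only when a caller asks for it. */
    synchronized String getSchedulingInfo() {
        if (cached == null) {
            formatCalls++;
            // Plain appends avoid the format-string parsing done by String.format().
            cached = new StringBuilder(64)
                    .append(runningMaps).append(" running map tasks, ")
                    .append(runningReduces).append(" running reduce tasks, ")
                    .append(pendingTasks).append(" pending tasks")
                    .toString();
        }
        return cached;
    }

    public static void main(String[] args) {
        LazySchedulingInfo info = new LazySchedulingInfo();
        // Simulate many heartbeats; none of them builds a string.
        for (int i = 0; i < 100_000; i++) {
            info.update(i, i, i);
        }
        System.out.println(info.getSchedulingInfo());
        System.out.println("format calls: " + info.formatCalls);
    }
}
```

With this shape, the cost of building the string is paid once per read instead
of once per heartbeat, at the price of keeping the raw counters around and
synchronizing reads with updates.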