[
https://issues.apache.org/jira/browse/MAPREDUCE-734?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12729052#action_12729052
]
Hemanth Yamijala commented on MAPREDUCE-734:
--------------------------------------------
This is looking fine.
- I would recommend moving the call to cancelReservedSlots into garbageCollect
rather than leaving it in jobComplete and terminate, where it is currently
made. The reason is that there is another API, terminateJob(), which can also
be called to end a job, and in that case too we'll need to cancel the
reservations. Rather than adding the call in yet another place, I think we can
move all the calls into garbageCollect, which is guaranteed to be called in all
cases. (I confirmed this by checking with the M/R team.)
- It would be good to add a test case for this. The simplest way is to use the
mock object facilities being added now. For instance, I think we can use
FakeObjectUtilities.FakeJobInProgress, create a bunch of TaskTracker objects,
and reserve slots on them for the FakeJobInProgress we create. Then we can
finish the job, which should trigger calls to unreserve the trackers.
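
To make the two suggestions above concrete, here is a rough, self-contained
sketch. It does not use the real Hadoop classes: Tracker and Job are
hypothetical stand-ins for TaskTracker and JobInProgress, the method signatures
are simplified to take no arguments, and the cleanedUp guard is my own
addition. Iterating over a copy of the reservation set is also an assumption
about one way to avoid a ConcurrentModificationException, not a description of
what the attached patch does.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.Set;

public class ReservationCleanupSketch {

    /** Hypothetical stand-in for a tracker holding a slot reservation for at most one job. */
    static class Tracker {
        final String name;
        Job reservedFor; // null when no job has reserved this tracker

        Tracker(String name) { this.name = name; }

        void reserve(Job job) {
            reservedFor = job;
            job.reserved.add(this);
        }

        void unreserve(Job job) {
            if (reservedFor == job) {
                reservedFor = null;
            }
            job.reserved.remove(this); // mutates the job's reservation set
        }
    }

    /** Hypothetical stand-in for a job with several ways to end. */
    static class Job {
        final Set<Tracker> reserved = new HashSet<>();
        boolean cleanedUp = false;

        // Every termination path funnels into garbageCollect().
        void jobComplete()  { garbageCollect(); }
        void terminate()    { garbageCollect(); }
        void terminateJob() { garbageCollect(); }

        void garbageCollect() {
            if (cleanedUp) {
                return; // make repeated cleanup harmless
            }
            cleanedUp = true;
            cancelReservedSlots();
        }

        void cancelReservedSlots() {
            // Iterate over a snapshot: unreserve() removes entries from 'reserved',
            // and removing from a set while iterating over it directly would throw
            // ConcurrentModificationException.
            for (Tracker t : new ArrayList<>(reserved)) {
                t.unreserve(this);
            }
        }
    }

    public static void main(String[] args) {
        Job job = new Job();
        Tracker[] trackers = { new Tracker("tt1"), new Tracker("tt2"), new Tracker("tt3") };
        for (Tracker t : trackers) {
            t.reserve(job);
        }
        job.terminateJob(); // the path that jobComplete/terminate alone would miss
        System.out.println("reserved after terminateJob: " + job.reserved.size());
    }
}
```

The main method mirrors the shape of the proposed test: reserve slots on a few
trackers, end the job, and check that no reservations remain. A real test
would use FakeObjectUtilities.FakeJobInProgress and actual TaskTracker objects
instead of these stand-ins.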
Arun, in order to save time (since we need to run all the tests, etc.), I've
asked Sreekanth to look at the test case.
> java.util.ConcurrentModificationException observed in unreserving slots for
> HiRam Jobs
> --------------------------------------------------------------------------------------
>
> Key: MAPREDUCE-734
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-734
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Components: contrib/capacity-sched
> Affects Versions: 0.21.0
> Reporter: Karam Singh
> Assignee: Arun C Murthy
> Fix For: 0.21.0
>
> Attachments: MAPREDUCE-734_0_20090708.patch,
> MAPREDUCE-734_0_20090708_yhadoop20.patch
>
>
> Ran jobs, out of which 3 were HiRAM. The jobs were not removed from the
> scheduler queue even after they completed successfully.
> hadoop queue -info queue -showJobs displays something like:
> job_200907080724_0031 2 1247059146868 username NORMAL 0 running
> map tasks using 0 map slots. 0 additional slots reserved. 0 running reduce
> tasks using 0 reduce slots. 60 additional slots reserved.
> job_200907080724_0030 2 1247059146972 username NORMAL 0 running
> map tasks using 0 map slots. 0 additional slots reserved. 0 running reduce
> tasks using 0 reduce slots. 60 additional slots reserved.
> They do not block anything, but they look like zombie processes in the system.
> The JobTracker log shows java.util.ConcurrentModificationException.