[ https://issues.apache.org/jira/browse/HADOOP-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Amar Kamat updated HADOOP-4053:
-------------------------------

    Attachment: HADOOP-4053-v3.1.patch

Attaching a patch that implements the {{JobChangeEvent}} concept. Here is how it is implemented.

_Assumptions :_
Everything that has the potential to change a job's state is captured and bundled under {{JobStatus}}. Hence taking a snapshot of the job's status before and after the event should be sufficient to determine the state change.

_Working :_
1) {{JobInProgressListener.jobUpdated()}} now takes {{JobChangeEvent}} as a parameter.
2) {{JobChangeEvent}} is an abstract class that has just one api, {{getJobInProgress()}}.
3) For the task at hand, i.e. handling _priority-change_, _start-time-change_ and _job-runstate-change_, I have extended {{JobChangeEvent}} to {{JobStatusChangeEvent}}.
4) {{JobStatusChangeEvent}} hosts a set of _sub-events_ that can lead to a job-status change. These are the fields of {{JobStatus}} that can change for a given job, for example _priority, start-time, run-state_ etc. While composing an event, one can specify which _sub-events_ constitute the state change. Note that the order in which the _sub-events_ are specified is also preserved.
5) For capacity-scheduler, the appropriate action is performed based on the _sub-events_ constituting the state transition. For now the actions are
 - promote a job from the waiting queue to the running queue
 - remove a job upon job completion
 - re-position the job in the queue, as the parameters that decide where the job is positioned have changed
6) If {{JobStatusChangeEvent}} fails to capture all the events then {{JobChangeEvent}} can be extended to cater for that case.
7) Other listener implementations remain unchanged as they just require {{jobInProgress}}, which is obtained from {{JobChangeEvent}}.

Tested the patch with capacity scheduler and it works fine. The web-ui doesn't show completed jobs in the job queue, which means that the job is removed upon completion.
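For illustration only, the event design described in points 1) to 7) can be sketched roughly as below. This is not the code in the attached patch: {{JobInProgress}} and {{JobStatus}} are stripped-down stand-ins, and the {{SubEvent}} and {{RunState}} enums, the {{QueueListener}} class, the {{getNewStatus()}} and {{getSubEvents()}} accessors and the action strings are all assumptions made for the example. The sketch also carries only the status after the change, not the before/after pair.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Hypothetical, stripped-down stand-in for Hadoop's JobInProgress. */
class JobInProgress {
    final String id;
    JobInProgress(String id) { this.id = id; }
}

/** Hypothetical snapshot of the job fields that can change. */
class JobStatus {
    enum RunState { PREP, RUNNING, SUCCEEDED, FAILED, KILLED }
    final RunState runState;
    final int priority;
    final long startTime;
    JobStatus(RunState runState, int priority, long startTime) {
        this.runState = runState;
        this.priority = priority;
        this.startTime = startTime;
    }
}

/** Base event: its only API returns the job the event refers to. */
abstract class JobChangeEvent {
    private final JobInProgress job;
    JobChangeEvent(JobInProgress job) { this.job = job; }
    JobInProgress getJobInProgress() { return job; }
}

/** Status-change event holding an ordered list of sub-events (names assumed). */
class JobStatusChangeEvent extends JobChangeEvent {
    enum SubEvent { PRIORITY_CHANGED, START_TIME_CHANGED, RUN_STATE_CHANGED }
    private final JobStatus newStatus;
    private final List<SubEvent> subEvents;

    JobStatusChangeEvent(JobInProgress job, JobStatus newStatus, SubEvent... subEvents) {
        super(job);
        this.newStatus = newStatus;
        // Copy into a list so the order given by the caller is preserved.
        this.subEvents = Collections.unmodifiableList(new ArrayList<>(Arrays.asList(subEvents)));
    }
    JobStatus getNewStatus() { return newStatus; }
    List<SubEvent> getSubEvents() { return subEvents; }
}

/** Hypothetical scheduler-side listener that records the action it would take. */
class QueueListener {
    final List<String> actions = new ArrayList<>();

    void jobUpdated(JobChangeEvent event) {
        if (!(event instanceof JobStatusChangeEvent)) {
            return; // other event types are ignored in this sketch
        }
        JobStatusChangeEvent e = (JobStatusChangeEvent) event;
        String id = e.getJobInProgress().id;
        for (JobStatusChangeEvent.SubEvent s : e.getSubEvents()) {
            if (s == JobStatusChangeEvent.SubEvent.RUN_STATE_CHANGED) {
                JobStatus.RunState now = e.getNewStatus().runState;
                if (now == JobStatus.RunState.RUNNING) {
                    actions.add("promote " + id);   // waiting queue -> running queue
                } else if (now != JobStatus.RunState.PREP) {
                    actions.add("remove " + id);    // succeeded, failed or killed
                }
            } else {
                actions.add("reposition " + id);    // priority or start time changed
            }
        }
    }
}

class JobEventDemo {
    public static void main(String[] args) {
        JobInProgress job = new JobInProgress("job_1");
        QueueListener listener = new QueueListener();
        listener.jobUpdated(new JobStatusChangeEvent(job,
            new JobStatus(JobStatus.RunState.RUNNING, 1, 0L),
            JobStatusChangeEvent.SubEvent.PRIORITY_CHANGED,
            JobStatusChangeEvent.SubEvent.RUN_STATE_CHANGED));
        listener.jobUpdated(new JobStatusChangeEvent(job,
            new JobStatus(JobStatus.RunState.SUCCEEDED, 1, 0L),
            JobStatusChangeEvent.SubEvent.RUN_STATE_CHANGED));
        System.out.println(listener.actions);
    }
}
```

A listener that only needs the job itself can call {{getJobInProgress()}} on the base type and ignore the sub-events, which is why the other listener implementations would not need to change.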
_test-patch_ and _ant test_ pass on my box. The rest of the listener implementations should not be affected. This patch is meant for 0.19.

> Schedulers need to know when a job has completed
> ------------------------------------------------
>
>                 Key: HADOOP-4053
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4053
>             Project: Hadoop Core
>          Issue Type: Improvement
>    Affects Versions: 0.19.0
>            Reporter: Vivek Ratan
>            Assignee: Amar Kamat
>             Fix For: 0.19.0
>
>         Attachments: HADOOP-4053-v1.patch, HADOOP-4053-v2.patch, HADOOP-4053-v3.1.patch
>
>
> The JobInProgressListener interface is used by the framework to notify Schedulers of when jobs are added, removed, or updated. Right now, there is no way for the Scheduler to know that a job has completed. jobRemoved() is called when a job is retired, which can happen many hours after a job is actually completed. jobUpdated() is called when a job's priority is changed. We need to notify a listener when a job has completed (either successfully, or has failed or been killed).

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.