[ 
https://issues.apache.org/jira/browse/HADOOP-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Amar Kamat updated HADOOP-4053:
-------------------------------

    Attachment: HADOOP-4053-v3.1.patch

Attaching a patch that implements the {{JobChangeEvent}} concept. Here is how 
it is implemented.

_Assumptions :_
Everything that has the potential to change a job's state is captured and 
bundled under {{JobStatus}}. Hence taking a snapshot of the job's status before 
and after the event should be sufficient to determine the state change.
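
As a rough illustration of the snapshot idea (this is not code from the patch), 
the sketch below compares a "before" and an "after" snapshot and reports which 
fields differ. {{StatusSnapshot}}, {{ChangedField}} and {{SnapshotDiff}} are 
hypothetical names, and the three fields are a simplified stand-in for what 
{{JobStatus}} actually holds.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified stand-in for the JobStatus fields that can change.
final class StatusSnapshot {
    final String priority;  // e.g. "NORMAL", "HIGH" (illustrative values)
    final long startTime;   // milliseconds since the epoch
    final String runState;  // e.g. "PREP", "RUNNING", "SUCCEEDED"

    StatusSnapshot(String priority, long startTime, String runState) {
        this.priority = priority;
        this.startTime = startTime;
        this.runState = runState;
    }
}

// Hypothetical labels for the kinds of change a snapshot comparison can find.
enum ChangedField { PRIORITY, START_TIME, RUN_STATE }

final class SnapshotDiff {
    // Returns the fields that differ between the two snapshots,
    // always in the fixed order priority, start time, run state.
    static List<ChangedField> diff(StatusSnapshot before, StatusSnapshot after) {
        List<ChangedField> changed = new ArrayList<>();
        if (!before.priority.equals(after.priority)) {
            changed.add(ChangedField.PRIORITY);
        }
        if (before.startTime != after.startTime) {
            changed.add(ChangedField.START_TIME);
        }
        if (!before.runState.equals(after.runState)) {
            changed.add(ChangedField.RUN_STATE);
        }
        return changed;
    }
}
```

An empty result means the event did not alter any of the tracked fields, so a 
listener could skip it.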

_Working :_
1) {{JobInProgressListener.jobUpdated()}} now takes {{JobChangeEvent}} as a 
parameter.

2) {{JobChangeEvent}} is an abstract class that has just one API, 
{{getJobInProgress()}}.

3) For the task at hand, i.e. handling _priority-change_, _start-time-change_ 
and _job-runstate-change_, I have extended {{JobChangeEvent}} to 
{{JobStatusChangeEvent}}.

4) {{JobStatusChangeEvent}} hosts a set of _sub-events_ that can lead to a 
job-status change. These are fields from {{JobStatus}} that have the potential 
to change for a given job, such as _priority, start-time, run-state_ etc. 
While composing an event, one can specify which _sub-events_ constitute the 
state change. Note that the order in which the _sub-events_ are specified is 
also preserved.

5) For capacity-scheduler, the appropriate action is performed based on the 
_sub-events_ constituting the state transition. For now the actions are
    - promote a job from the waiting queue to the running queue
    - remove a job upon job completion
    - re-position the job in the queue, as the parameters that decide where the 
job is positioned have changed

6) If {{JobStatusChangeEvent}} fails to capture all the events then 
{{JobChangeEvent}} can be extended to cater to that case.

7) Other listener implementations remain unchanged as they just require 
{{jobInProgress}} which is obtained from {{JobChangeEvent}}.
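
The steps above can be sketched roughly as follows. This is a minimal 
illustration and not the code in the attached patch: {{JobInProgress}} here is 
a tiny stand-in for the real class, the {{SubEvent}} constant names, the 
{{getSubEvents()}} accessor and {{QueueListener}} are assumptions made for the 
example, and the ordering rule (priority, then start time) is a guess at what a 
scheduler queue might use.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Simplified stand-in for the real JobInProgress; only what the sketch needs.
class JobInProgress {
    final String id;
    int priority;              // assumption: lower value means more urgent
    long startTime;
    String runState = "PREP";  // "PREP", "RUNNING", "SUCCEEDED", "FAILED", "KILLED"

    JobInProgress(String id, int priority, long startTime) {
        this.id = id;
        this.priority = priority;
        this.startTime = startTime;
    }

    boolean isComplete() {
        return runState.equals("SUCCEEDED") || runState.equals("FAILED")
                || runState.equals("KILLED");
    }
}

// Base event: exposes only the job it refers to.
abstract class JobChangeEvent {
    private final JobInProgress job;
    JobChangeEvent(JobInProgress job) { this.job = job; }
    JobInProgress getJobInProgress() { return job; }
}

// Hypothetical names for the sub-events described in the comment.
enum SubEvent { PRIORITY_CHANGED, START_TIME_CHANGED, RUN_STATE_CHANGED }

// Status-change event: carries the sub-events in the order they were given.
class JobStatusChangeEvent extends JobChangeEvent {
    private final List<SubEvent> subEvents;

    JobStatusChangeEvent(JobInProgress job, SubEvent... subEvents) {
        super(job);
        this.subEvents = new ArrayList<>(Arrays.asList(subEvents));
    }

    List<SubEvent> getSubEvents() { return subEvents; }
}

// Toy listener with a waiting queue and a running queue.
class QueueListener {
    final List<JobInProgress> waiting = new ArrayList<>();
    final List<JobInProgress> running = new ArrayList<>();

    private static final Comparator<JobInProgress> ORDER =
            Comparator.comparingInt((JobInProgress j) -> j.priority)
                      .thenComparingLong(j -> j.startTime);

    void jobAdded(JobInProgress job) {
        waiting.add(job);
        waiting.sort(ORDER);
    }

    void jobUpdated(JobChangeEvent event) {
        if (!(event instanceof JobStatusChangeEvent)) {
            return; // other event types are ignored in this sketch
        }
        JobInProgress job = event.getJobInProgress();
        for (SubEvent e : ((JobStatusChangeEvent) event).getSubEvents()) {
            switch (e) {
                case RUN_STATE_CHANGED:
                    if (job.isComplete()) {
                        // remove a job upon completion
                        waiting.remove(job);
                        running.remove(job);
                    } else if (job.runState.equals("RUNNING") && waiting.remove(job)) {
                        // promote from the waiting queue to the running queue
                        running.add(job);
                    }
                    break;
                case PRIORITY_CHANGED:
                case START_TIME_CHANGED:
                    // re-position: the sort keys of a waiting job changed
                    waiting.sort(ORDER);
                    break;
            }
        }
    }
}
```

A listener that only needs the job can call {{getJobInProgress()}} on any 
{{JobChangeEvent}} and ignore the sub-events entirely.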

Tested the patch with the capacity scheduler and it works fine. The web UI 
doesn't show completed jobs in the job queue, which means that the job is 
removed upon completion. _test-patch_ and _ant test_ pass on my box. The rest 
of the listener implementations should not be affected.
This patch is meant for 0.19.

> Schedulers need to know when a job has completed
> ------------------------------------------------
>
>                 Key: HADOOP-4053
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4053
>             Project: Hadoop Core
>          Issue Type: Improvement
>    Affects Versions: 0.19.0
>            Reporter: Vivek Ratan
>            Assignee: Amar Kamat
>             Fix For: 0.19.0
>
>         Attachments: HADOOP-4053-v1.patch, HADOOP-4053-v2.patch, 
> HADOOP-4053-v3.1.patch
>
>
> The JobInProgressListener interface is used by the framework to notify 
> Schedulers of when jobs are added, removed, or updated. Right now, there is 
> no way for the Scheduler to know that a job has completed. jobRemoved() is 
> called when a job is retired, which can happen many hours after a job is 
> actually completed. jobUpdated() is called when a job's priority is changed. 
> We need to notify a listener when a job has completed (either successfully, 
> or has failed or been killed). 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
