[ 
https://issues.apache.org/jira/browse/HADOOP-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12634460#action_12634460
 ] 

Steve Loughran commented on HADOOP-4053:
----------------------------------------

My needs aren't so much job scheduling as workflow integration. I'm just 
listening for job lifecycle events so that I can match that lifecycle in remote 
code. As of yesterday I have simple MR jobs being deployed against a 
dynamically instantiated set of Hadoop processes, using job.getStatus() to poll 
the state of the job and detecting success/failure when the job declares itself 
completed. But already I can see that my tests get into trouble here: they 
tear down the processes as soon as the job is finished, and I see error 
messages in the test log complaining that the trackers can't write their 
task/job histories because the filesystem has gone down. I need to 
 - consider moving from polling to notifications to check job state (these 
would be RMI calls or something similar, hence slow)
 - wait until the job and task trackers are completely done with processing the 
jobs before pulling out the results and shutting down the cluster

So: I have no expectation that the base methods do anything; I'm just relaying 
events to other programs that may or may not care.

For the queue, I'd have a single queue of job events, 
{{Queue<JobLifecycleEvent> events}}, and handle each callback like this:
{{{
  public void jobCompleted(JobInProgress jip) {
    events.add(new JobLifecycleEvent(JobLifecycleEventType.COMPLETED, jip));
  }
}}}
The queue thread would then forward these off to whatever remote entity 
cared.
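
To make the queue idea concrete, here is a minimal, self-contained sketch of a listener that only enqueues, with a separate thread doing the (possibly slow) forwarding. Everything in it is an assumption for illustration rather than Hadoop code: QueueingJobListener, the stop() method and its sentinel event are hypothetical, a String job id stands in for JobInProgress, a Consumer stands in for the remote entity, and the event type values other than COMPLETED are made up.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.function.Consumer;

/** Hypothetical event types; only COMPLETED is named in the comment above. */
enum JobLifecycleEventType { ADDED, COMPLETED, REMOVED }

/** Hypothetical event; a String job id stands in for JobInProgress. */
final class JobLifecycleEvent {
  final JobLifecycleEventType type;
  final String jobId;

  JobLifecycleEvent(JobLifecycleEventType type, String jobId) {
    this.type = type;
    this.jobId = jobId;
  }
}

/** Hypothetical listener: callbacks only enqueue, a worker thread forwards. */
final class QueueingJobListener {
  // Sentinel that tells the forwarder thread to exit; compared by identity.
  private static final JobLifecycleEvent STOP = new JobLifecycleEvent(null, null);

  private final BlockingQueue<JobLifecycleEvent> events = new LinkedBlockingQueue<>();
  private final Thread forwarder;

  /** The Consumer stands in for the slow remote call (RMI or similar). */
  QueueingJobListener(Consumer<JobLifecycleEvent> remote) {
    forwarder = new Thread(() -> {
      try {
        for (JobLifecycleEvent e = events.take(); e != STOP; e = events.take()) {
          try {
            remote.accept(e);
          } catch (RuntimeException ex) {
            // A failing remote should not kill the forwarder; drop and go on.
          }
        }
      } catch (InterruptedException ex) {
        Thread.currentThread().interrupt();
      }
    }, "job-event-forwarder");
    forwarder.start();
  }

  /** Called on the framework's thread; the unbounded queue never blocks here. */
  public void jobCompleted(String jobId) {
    events.add(new JobLifecycleEvent(JobLifecycleEventType.COMPLETED, jobId));
  }

  /** Forwards everything already queued, then stops the forwarder thread. */
  public void stop() {
    events.add(STOP);
    try {
      forwarder.join();
    } catch (InterruptedException ex) {
      Thread.currentThread().interrupt();
    }
  }
}
```

A caller would construct the listener with whatever does the remote notification, let the framework invoke jobCompleted(), and call stop() before tearing the cluster down so that queued events are delivered first. The unbounded queue keeps the callback cheap, at the cost of unbounded memory if the remote side stalls.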

Given that schedulers and other listeners behave differently, I'm now not so 
sure about a base class. The javadocs for the listener need to make it clear 
that blocking isn't allowed so that anyone providing a listener knows to do 
async work if needed.

> Schedulers need to know when a job has completed
> ------------------------------------------------
>
>                 Key: HADOOP-4053
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4053
>             Project: Hadoop Core
>          Issue Type: Improvement
>    Affects Versions: 0.19.0
>            Reporter: Vivek Ratan
>            Assignee: Amar Kamat
>             Fix For: 0.19.0
>
>         Attachments: HADOOP-4053-v1.patch
>
>
> The JobInProgressListener interface is used by the framework to notify 
> Schedulers of when jobs are added, removed, or updated. Right now, there is 
> no way for the Scheduler to know that a job has completed. jobRemoved() is 
> called when a job is retired, which can happen many hours after a job is 
> actually completed. jobUpdated() is called when a job's priority is changed. 
> We need to notify a listener when a job has completed (either successfully, 
> or has failed or been killed). 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
