[ https://issues.apache.org/jira/browse/HADOOP-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12634460#action_12634460 ]
Steve Loughran commented on HADOOP-4053:
----------------------------------------

My needs aren't so much job scheduling as workflow integration. I'm just listening for job lifecycle events so that I can match that lifecycle in remote code.

As of yesterday I have simple MR jobs being deployed against a dynamically instantiated set of hadoop processes, using job.getStatus() to poll the state of the job and detecting success/failure when the job declares itself completed. But already I can see that my tests get into trouble here, as they tear down the processes once the job is finished, and I see error messages in the test log complaining that the trackers can't write their task/job histories as the filesystem has gone down.

I need to
- consider moving from polling to notifications to check job state (these would be RMI calls or something similar, hence slow)
- wait until the job and task trackers are completely done with processing the jobs before pulling out the results and shutting down the cluster

So: no expectation that the base methods do anything; I'm just relaying events to other programs that may or may not care.

For the queue, I'd have a single queue of job events {{Queue<JobLifecycleEvent> events}} and handle completion with
{{{
public void jobCompleted(JobInProgress jip) {
  events.add(new JobLifecycleEvent(JobLifecycleEventType.COMPLETED, jip));
}
}}}
Then the queue thread would forward these off to whatever remote entity cared.

Given that schedulers and other listeners behave differently, I'm now not so sure about a base class. The javadocs for the listener need to make it clear that blocking isn't allowed, so that anyone providing a listener knows to do async work if needed.
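A minimal sketch of the queue-and-forward idea described in this comment, under stated assumptions: the JobInProgress class below is a hypothetical stand-in (not Hadoop's real class), JobLifecycleEvent and JobLifecycleEventType are illustrative definitions of the names used in the comment, and the Consumer-based forwarder stands in for whatever RMI or similar remote call would actually be used. The point it tries to show is that the listener callback only enqueues and returns, while a separate thread does the slow forwarding.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.function.Consumer;

// Sketch only: relays job lifecycle events to a (possibly slow) remote consumer
// without blocking the thread that delivers the listener callback.
class JobEventRelay {

    // Hypothetical event types; only COMPLETED appears in the original comment.
    enum JobLifecycleEventType { ADDED, COMPLETED }

    // Hypothetical stand-in for Hadoop's JobInProgress, reduced to an id.
    static final class JobInProgress {
        final String jobId;
        JobInProgress(String jobId) { this.jobId = jobId; }
    }

    // Hypothetical event record pairing an event type with its job.
    static final class JobLifecycleEvent {
        final JobLifecycleEventType type;
        final JobInProgress job;
        JobLifecycleEvent(JobLifecycleEventType type, JobInProgress job) {
            this.type = type;
            this.job = job;
        }
    }

    // Unbounded queue, so add() never blocks the caller.
    private final BlockingQueue<JobLifecycleEvent> events = new LinkedBlockingQueue<>();
    private final Consumer<JobLifecycleEvent> forwarder;
    private final Thread worker;

    // forwarder is an assumption: it represents the slow remote call.
    JobEventRelay(Consumer<JobLifecycleEvent> forwarder) {
        this.forwarder = forwarder;
        this.worker = new Thread(this::drain, "job-event-relay");
        this.worker.setDaemon(true);
    }

    void start() { worker.start(); }

    // Listener callback: enqueue and return immediately, never block.
    public void jobCompleted(JobInProgress jip) {
        events.add(new JobLifecycleEvent(JobLifecycleEventType.COMPLETED, jip));
    }

    // Queue thread: take events one at a time and forward them.
    private void drain() {
        try {
            while (true) {
                forwarder.accept(events.take());
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // stop requested
        }
    }

    // Interrupts the queue thread; events still queued at that point are dropped.
    void stop() throws InterruptedException {
        worker.interrupt();
        worker.join();
    }
}
```

In this sketch, stop() discards anything still queued; a version meant for the teardown problem described above would presumably want to drain the queue before shutting the cluster down.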

> Schedulers need to know when a job has completed
> ------------------------------------------------
>
>                 Key: HADOOP-4053
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4053
>             Project: Hadoop Core
>          Issue Type: Improvement
>    Affects Versions: 0.19.0
>            Reporter: Vivek Ratan
>            Assignee: Amar Kamat
>             Fix For: 0.19.0
>
>         Attachments: HADOOP-4053-v1.patch
>
>
> The JobInProgressListener interface is used by the framework to notify
> Schedulers of when jobs are added, removed, or updated. Right now, there is
> no way for the Scheduler to know that a job has completed. jobRemoved() is
> called when a job is retired, which can happen many hours after a job is
> actually completed. jobUpdated() is called when a job's priority is changed.
> We need to notify a listener when a job has completed (either successfully,
> or has failed or been killed).

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.