[ 
https://issues.apache.org/jira/browse/HADOOP-1111?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12480560
 ] 

Tom White commented on HADOOP-1111:
-----------------------------------

Alejandro - I don't think you're missing anything. The Hadoop retry framework 
is currently only appropriate for synchronous retries, whereas this patch uses 
asynchronous retries to avoid tying up the JobTracker.

Stepping back a moment, though, I wonder why we need retries for job 
notification - could we start out simple and add them later? If not, could we 
look at the retry mechanisms that HttpClient provides 
(http://jakarta.apache.org/commons/httpclient/exception-handling.html). Or 
perhaps synchronous retries (using the Hadoop retry framework) from a 
LinkedBlockingQueue (as David suggests) would do.

Why is there a special case for local runner notification? Could it not use the 
same mechanism as the more general case? (Also, I think the code in 
localRunnerNotification may go into an infinite loop if exceptions keep 
occurring, since I can't see where the number of retries is decremented.)
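
To illustrate the infinite-loop concern, here is a minimal sketch of a retry 
loop that always terminates, because the attempt counter advances whether or 
not an exception is thrown. It is not the code from the patch: the class name 
BoundedRetry and the method notifyWithRetries are hypothetical, and the 
Callable stands in for the HTTP GET to the notification URL.

```java
import java.util.concurrent.Callable;

// Hypothetical helper, not taken from patch-1111.txt.
class BoundedRetry {

    /**
     * Runs the action once, then up to `retries` more times on failure,
     * sleeping retryIntervalMs between attempts.
     * Returns true as soon as one attempt succeeds, false otherwise.
     */
    static boolean notifyWithRetries(Callable<Boolean> action,
                                     int retries,
                                     long retryIntervalMs) {
        // The loop variable advances on every pass, including passes that
        // end in an exception, so at most retries + 1 attempts are made.
        for (int attempt = 0; attempt <= retries; attempt++) {
            if (attempt > 0) {
                try {
                    Thread.sleep(retryIntervalMs);
                } catch (InterruptedException e) {
                    // Restore the interrupt flag and give up.
                    Thread.currentThread().interrupt();
                    return false;
                }
            }
            try {
                if (action.call()) {
                    return true;
                }
            } catch (Exception e) {
                // Treat any exception as a failed attempt and keep counting.
            }
        }
        return false;
    }
}
```

With retries set to 3, an action that always throws is attempted four times 
and the method then returns false instead of looping forever.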



> Job completion notification to a job configured URL
> ---------------------------------------------------
>
>                 Key: HADOOP-1111
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1111
>             Project: Hadoop
>          Issue Type: New Feature
>          Components: mapred
>    Affects Versions: 0.12.0
>         Environment: all
>            Reporter: Alejandro Abdelnur
>         Attachments: patch-1111.txt, patch-1111.txt
>
>
> Currently clients have to poll the JobTracker to find out whether a job has 
> completed.
> When invoking Hadoop from other systems, it is desirable to have a 
> notification mechanism on job completion. 
> The notification approach simplifies the client waiting for completion and 
> removes load from the JobTracker as polling can be avoided. 
> Proposed solution:
> When the JobTracker processes the completion of a job (success or failure), 
> if the job configuration has a jobEnd.notificationUrl property it will make 
> an HTTP GET request to the specified URL.
> The jobEnd.notificationUrl property may include two variables, '${jobId}' 
> and '${jobStatus}'. If they are present, they will be replaced with the job 
> ID and status of the job and the URL will be invoked.
> Two additional properties, 'jobEnd.retries' and 'jobEnd.retryInterval', will 
> indicate retry behavior.
> To avoid delaying JobTracker processing while doing notifications, a 
> producer-consumer queue will be used to queue up job notifications upon 
> completion.
> A daemon thread will consume job notifications from the above Queue and will 
> make the URL invocation. 
> On notification failure, the job notification is queued up again on the 
> notification queue.
> The queue will be a java.util.concurrent.DelayQueue. This will make job 
> notifications (on retries) available on the consumer side only when the 
> retry time is up.
> The changes will be done in the JobTracker and in the LocalJobRunner.
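
The DelayQueue behaviour described in the proposal can be sketched roughly as 
follows. This is an illustration under assumptions, not the attached patch: 
the class name JobEndNotification, its fields, and scheduleRetry are 
hypothetical; only java.util.concurrent.DelayQueue and the Delayed interface 
are real JDK APIs.

```java
import java.util.concurrent.DelayQueue;
import java.util.concurrent.Delayed;
import java.util.concurrent.TimeUnit;

// Hypothetical queue element: one pending notification for one job.
class JobEndNotification implements Delayed {
    final String url;
    final long retryIntervalMs;
    private int retriesLeft;
    private long readyAtMs;

    JobEndNotification(String url, int retries, long retryIntervalMs) {
        this.url = url;
        this.retriesLeft = retries;
        this.retryIntervalMs = retryIntervalMs;
        // The first attempt is eligible for consumption immediately.
        this.readyAtMs = System.currentTimeMillis();
    }

    /**
     * Consumes one retry and pushes the ready time out by the retry interval.
     * Returns false when no retries remain, so the caller can drop the
     * notification instead of re-queueing it.
     */
    boolean scheduleRetry() {
        if (retriesLeft <= 0) {
            return false;
        }
        retriesLeft--;
        readyAtMs = System.currentTimeMillis() + retryIntervalMs;
        return true;
    }

    @Override
    public long getDelay(TimeUnit unit) {
        // A zero or negative delay means the element may be taken.
        return unit.convert(readyAtMs - System.currentTimeMillis(),
                            TimeUnit.MILLISECONDS);
    }

    @Override
    public int compareTo(Delayed other) {
        return Long.compare(getDelay(TimeUnit.MILLISECONDS),
                            other.getDelay(TimeUnit.MILLISECONDS));
    }
}
```

A consumer thread would take() an element from a 
DelayQueue<JobEndNotification>, attempt the HTTP GET, and on failure call 
scheduleRetry() and add the element back only if it returns true. The queue 
then hides the element from the consumer until the retry interval has elapsed.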

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
