[ 
https://issues.apache.org/jira/browse/PIG-1478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12887602#action_12887602
 ] 

Richard Ding commented on PIG-1478:
-----------------------------------

bq. I don't understand the difference between launchStartedNotification() and 
jobsSubmittedNotification().

launchStartedNotification() tells the listeners the total number of jobs to 
submit for the script. jobsSubmittedNotification() tells the listeners the 
number of jobs submitted in one batch. Because of the dependencies between 
jobs, Pig may not be able to submit all the jobs together. So the 
numJobsToLaunch passed to launchStartedNotification() should equal the sum of 
the numJobsSubmitted values of all jobsSubmittedNotification() calls.
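
The relationship between the two callbacks can be checked with a small 
counter. This is only a minimal sketch and not part of the patch: 
SubmissionTracker is a hypothetical standalone class that mirrors the two 
method names but does not implement the real listener interface, so it 
compiles without Pig on the classpath.

```java
// Hypothetical helper, not part of PIG-1478.patch. It mirrors two callback
// names from the proposed listener but does not implement the Pig interface.
class SubmissionTracker {
    private int numJobsToLaunch = -1;  // -1 until launchStartedNotification() is seen
    private int totalSubmitted = 0;    // running sum over all submitted batches

    // Called once, with the total number of jobs planned for the script.
    void launchStartedNotification(int numJobsToLaunch) {
        this.numJobsToLaunch = numJobsToLaunch;
    }

    // Called once per batch; batch sizes are limited by job dependencies.
    void jobsSubmittedNotification(int numJobsSubmitted) {
        totalSubmitted += numJobsSubmitted;
    }

    // Jobs announced at launch that have not been submitted yet.
    int remaining() {
        return numJobsToLaunch - totalSubmitted;
    }

    // True once the batch sizes add up to the announced total.
    boolean allJobsSubmitted() {
        return numJobsToLaunch >= 0 && totalSubmitted == numJobsToLaunch;
    }

    public static void main(String[] args) {
        SubmissionTracker tracker = new SubmissionTracker();
        tracker.launchStartedNotification(3);
        for (int batch = 0; batch < 3; batch++) {
            tracker.jobsSubmittedNotification(1);  // three batches of one job each
            System.out.println("remaining: " + tracker.remaining());
        }
        System.out.println("all submitted: " + tracker.allJobsSubmitted());
    }
}
```

A real listener would keep the same two counters inside its implementation of 
the interface.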

bq. When will outputCompletedNotification() be called? Only after the job is 
completely done? What, if any, guarantees are we making on the order of this 
relative to when PigRunner.run returns?

outputCompletedNotification() is called after the job that writes this output 
is done. This is only called for user outputs. As a script can have multiple 
user outputs, some outputs may be written before all jobs are done. 

bq. It isn't clear to me that launchCompleteNotification() is useful. Once the 
launch has completed the user will start getting jobStartedNotification() calls.

Just trying to be complete. launchCompleteNotification() is called when all 
jobs are done. If a script is executed successfully, the numJobsSucceeded 
should equal the numJobsToLaunch from launchStartedNotification().

An example log trace looks like this:

{code}
---- numJobsToLaunch: 3
---- jobs submitted: 1
---- progress: 0%
---- job started: job_20100702195434153_0002
---- progress: 16%
---- progress: 33%
---- job finished: job_20100702195434153_0002
---- jobs submitted: 1
---- job started: job_20100702195434153_0003
---- progress: 50%
---- progress: 66%
---- job finished: job_20100702195434153_0003
---- jobs submitted: 1
---- job started: job_20100702195434153_0004
---- progress: 83%
---- output done: hdfs://localhost.localdomain:52083/user/pig/myoutput
---- job finished: job_20100702195434153_0004
---- progress: 100%
---- numJobsSucceeded: 3
{code}
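
A trace in this format could come from a listener along the following lines. 
This is a sketch under assumptions, not the code that produced the log above: 
TraceListener is a hypothetical standalone class, and its job and output 
callbacks take plain strings where the proposed interface passes JobStats and 
OutputStats, so that it compiles without Pig. The job ID and output path in 
main() are made up.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a listener implementation. The proposed callbacks
// take JobStats/OutputStats objects; plain strings are used here instead.
class TraceListener {
    private final List<String> lines = new ArrayList<>();

    // Record one trace line and echo it to stdout.
    private void log(String message) {
        String line = "---- " + message;
        lines.add(line);
        System.out.println(line);
    }

    void launchStartedNotification(int numJobsToLaunch) {
        log("numJobsToLaunch: " + numJobsToLaunch);
    }

    void jobsSubmittedNotification(int numJobsSubmitted) {
        log("jobs submitted: " + numJobsSubmitted);
    }

    void jobStartedNotification(String assignedJobId) {
        log("job started: " + assignedJobId);
    }

    void jobFinishedNotification(String jobId) {
        log("job finished: " + jobId);
    }

    void outputCompletedNotification(String location) {
        log("output done: " + location);
    }

    void progressUpdatedNotification(int progress) {
        log("progress: " + progress + "%");
    }

    void launchCompletedNotification(int numJobsSucceeded) {
        log("numJobsSucceeded: " + numJobsSucceeded);
    }

    // All trace lines recorded so far, in call order.
    List<String> lines() {
        return lines;
    }

    public static void main(String[] args) {
        // Replay a single-job script by hand; the ID and path are invented.
        TraceListener listener = new TraceListener();
        listener.launchStartedNotification(1);
        listener.jobsSubmittedNotification(1);
        listener.progressUpdatedNotification(0);
        listener.jobStartedNotification("job_example_0001");
        listener.progressUpdatedNotification(50);
        listener.outputCompletedNotification("hdfs://example/user/pig/out");
        listener.jobFinishedNotification("job_example_0001");
        listener.progressUpdatedNotification(100);
        listener.launchCompletedNotification(1);
    }
}
```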

> Add progress notification listener to PigRunner API
> ---------------------------------------------------
>
>                 Key: PIG-1478
>                 URL: https://issues.apache.org/jira/browse/PIG-1478
>             Project: Pig
>          Issue Type: Improvement
>            Reporter: Richard Ding
>            Assignee: Richard Ding
>             Fix For: 0.8.0
>
>         Attachments: PIG-1478.patch
>
>
> PIG-1333 added the PigRunner API to allow Pig users and tools to get a 
> status/stats object back after executing a Pig script. The new API, however, 
> is synchronous (blocking). A Pig script can spawn tens (even hundreds) of MR 
> jobs and take hours to complete, so it would be useful to give progress 
> feedback to the callers during the execution.
> The proposal is to add an optional parameter to the API:
> {code}
> public abstract class PigRunner {
>     public static PigStats run(String[] args,
>             PigProgressNotificationListener listener) {...}
> }
> {code} 
> The new listener is defined as follows:
> {code}
> package org.apache.pig.tools.pigstats;
> public interface PigProgressNotificationListener
>         extends java.util.EventListener {
>     // just before the launch of MR jobs for the script
>     public void launchStartedNotification(int numJobsToLaunch);
>     // number of jobs submitted in a batch
>     public void jobsSubmittedNotification(int numJobsSubmitted);
>     // a job is started
>     public void jobStartedNotification(String assignedJobId);
>     // a job is completed successfully
>     public void jobFinishedNotification(JobStats jobStats);
>     // a job has failed
>     public void jobFailedNotification(JobStats jobStats);
>     // a user output is completed successfully
>     public void outputCompletedNotification(OutputStats outputStats);
>     // updates the progress as percentage
>     public void progressUpdatedNotification(int progress);
>     // the script execution is done
>     public void launchCompletedNotification(int numJobsSucceeded);
> }
> {code}
> Any thoughts?

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
