[ 
https://issues.apache.org/jira/browse/TEZ-1062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13977417#comment-13977417
 ] 

Bikas Saha commented on TEZ-1062:
---------------------------------

Looks good overall.

This needs to be in the tez-runtime-library project, under a new package 
org.apache.tez.runtime.library.processor.

In general, either inputs or outputs could be null.
{code}
+    Preconditions.checkNotNull(inputs, "inputs can't be null");
+    Preconditions.checkNotNull(outputs, "ouputs can't be null");
{code}
This means we will need null checks in other places.
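
One way to keep those null checks in a single place is sketched below. This is only an illustration: NullSafeMaps, orEmpty and count are hypothetical names, not part of the patch or of Tez. The idea is to normalize a null map to an empty one before iterating.

```java
import java.util.Collections;
import java.util.Map;

// Hypothetical helper (not part of the patch): treats a null map as empty so
// that callers can iterate without repeating the null check everywhere.
final class NullSafeMaps {
  private NullSafeMaps() {}

  // Returns the given map, or an immutable empty map when it is null.
  static <K, V> Map<K, V> orEmpty(Map<K, V> map) {
    return map == null ? Collections.<K, V>emptyMap() : map;
  }

  // Example caller: counts entries without caring whether the map was null.
  static <K, V> int count(Map<K, V> map) {
    return orEmpty(map).size();
  }
}
```

With something like this, a loop over outputs can be written as for (LogicalOutput o : orEmpty(outputs).values()) and simply does nothing when outputs is null.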

How about exposing the inputs and outputs via getters and renaming 
this to run()?
{code}
+  public abstract void execute(Map<String, LogicalInput> inputs, Map<String, LogicalOutput> outputs)
+      throws Exception;
{code}
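
A rough sketch of the shape I have in mind for the getters plus run(). The LogicalInput and LogicalOutput interfaces here are empty stand-ins so the snippet compiles on its own, and SimpleProcessorSketch and runTask are hypothetical names, not the final API.

```java
import java.util.Map;

// Empty stand-ins for the Tez types, only so this sketch is self-contained.
interface LogicalInput {}
interface LogicalOutput {}

// Hypothetical base class: the framework-facing entry point stores the maps,
// and subclasses read them through getters from a no-argument run().
abstract class SimpleProcessorSketch {
  private Map<String, LogicalInput> inputs;
  private Map<String, LogicalOutput> outputs;

  // Entry point the framework would call; either map may be null.
  public final void runTask(Map<String, LogicalInput> inputs,
                            Map<String, LogicalOutput> outputs) throws Exception {
    this.inputs = inputs;
    this.outputs = outputs;
    run();
  }

  public Map<String, LogicalInput> getInputs() {
    return inputs;
  }

  public Map<String, LogicalOutput> getOutputs() {
    return outputs;
  }

  // The only method a user of the simple processor has to implement.
  public abstract void run() throws Exception;
}
```

The user-facing method then has no parameters, so its signature stays the same if more state is exposed through getters later.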

I think it makes sense to move this code into the postOp of SimpleProcessor.
Secondly, we should call getContext().canCommit() only once. Sorry, the code in 
the original UnionExample is wrong. We first need to check whether a commit is 
required. If so, get permission from the context, then commit all outputs that 
need a commit. If any output fails to commit, we should abort all the outputs 
that needed a commit.
{code}
+    protected void postOp(Map<String, LogicalInput> inputs, Map<String, LogicalOutput> outputs)
+        throws Exception {
+      for (LogicalOutput output : outputs.values()) {
+        if ((output instanceof MROutput) && (((MROutput) output).isCommitRequired())) {
+          while (!getContext().canCommit()) {
+            Thread.sleep(100);
+          }
+          ((MROutput) output).commit();
+        }
+      }
{code}
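
The commit sequence described above could look roughly like the sketch below. CommittableOutput and CommitContext are stand-in interfaces that model only the isCommitRequired/commit/abort and canCommit calls, and CommitSketch, commitOutputs and pollMillis are hypothetical names. I am reading "only once" as one permission request per task (polling until it is granted) rather than one per output.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Stand-in for the commit-related part of an output such as MROutput.
interface CommittableOutput {
  boolean isCommitRequired();
  void commit() throws Exception;
  void abort() throws Exception;
}

// Stand-in for the processor context; only canCommit() is modelled.
interface CommitContext {
  boolean canCommit() throws Exception;
}

final class CommitSketch {
  private CommitSketch() {}

  static void commitOutputs(CommitContext context,
                            Map<String, ? extends CommittableOutput> outputs,
                            long pollMillis) throws Exception {
    if (outputs == null) {
      return; // outputs may legitimately be null
    }
    // 1. Work out which outputs need a commit at all.
    List<CommittableOutput> toCommit = new ArrayList<>();
    for (CommittableOutput output : outputs.values()) {
      if (output.isCommitRequired()) {
        toCommit.add(output);
      }
    }
    if (toCommit.isEmpty()) {
      return;
    }
    // 2. Ask for permission once for the whole task, not once per output.
    while (!context.canCommit()) {
      Thread.sleep(pollMillis);
    }
    // 3. Commit everything; on any failure abort all outputs that needed commit.
    try {
      for (CommittableOutput output : toCommit) {
        output.commit();
      }
    } catch (Exception commitFailure) {
      for (CommittableOutput output : toCommit) {
        try {
          output.abort();
        } catch (Exception abortFailure) {
          commitFailure.addSuppressed(abortFailure);
        }
      }
      throw commitFailure;
    }
  }
}
```

The abort loop also covers outputs that had already committed, which matches "abort all the outputs that needed commit"; whether abort after a successful commit is safe for a given output type is something the real implementation would have to confirm.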

There are 3 pure Tez examples now: WordCount, UnionExample and 
BroadcastAndOneToOneExample. We should change all of them to use the 
SimpleProcessor where it makes sense.

> Create SimpleProcessor for processors that only need to implement the run 
> method
> --------------------------------------------------------------------------------
>
>                 Key: TEZ-1062
>                 URL: https://issues.apache.org/jira/browse/TEZ-1062
>             Project: Apache Tez
>          Issue Type: Sub-task
>            Reporter: Bikas Saha
>            Assignee: Mohammad Kamrul Islam
>         Attachments: TEZ-1062.1.patch
>
>
> The SimpleProcessor could take care of things like starting inputs and 
> committing outputs. It would handle no events, since simple processors don't 
> need to handle inputs. Thus the user would only need to implement their 
> custom task logic in a new execute() method.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
