[ https://issues.apache.org/jira/browse/TEZ-1062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13977417#comment-13977417 ]
Bikas Saha commented on TEZ-1062:
---------------------------------

Looks good overall.

This needs to be in the tez-runtime-library project under a new package, org.apache.tez.runtime.library.processor.

In general, either inputs or outputs could be null.
{code}
+    Preconditions.checkNotNull(inputs, "inputs can't be null");
+    Preconditions.checkNotNull(outputs, "ouputs can't be null");
{code}

This means we will need null checks in other places. How about exposing the inputs and outputs via getters and renaming this to run()?
{code}
+  public abstract void execute(Map<String, LogicalInput> inputs, Map<String, LogicalOutput> outputs)
+      throws Exception;
{code}

I think it makes sense to move this code into the postOp of SimpleProcessor. Secondly, we should call getContext().canCommit() only once. Sorry, the code in the original UnionExample is wrong. We need to check whether a commit is required. If yes, get permission from the context, then commit all outputs that need a commit. If any output fails to commit, we should abort all the outputs that needed a commit.
{code}
+  protected void postOp(Map<String, LogicalInput> inputs, Map<String, LogicalOutput> outputs)
+      throws Exception {
+    for (LogicalOutput output : outputs.values()) {
+      if ((output instanceof MROutput) && (((MROutput) output).isCommitRequired())) {
+        while (!getContext().canCommit()) {
+          Thread.sleep(100);
+        }
+        ((MROutput) output).commit();
+      }
+    }
{code}

There are 3 pure Tez examples now: WordCount, UnionExample and BroadcastAndOneToOneExample. We should change all of them to use the SimpleProcessor where it makes sense.
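
The commit sequence described in the comment above (check whether a commit is required, get permission once, commit every output that needs it, abort all of them if any commit fails) could be sketched roughly as follows. This is only an illustration under stated assumptions: CommittableOutput and CommitGate are hypothetical stand-ins for MROutput and for getContext().canCommit(), not real Tez types, and the abort() method and the pollMillis parameter are assumptions made for the sketch. The polling loop from the patch is kept, but permission is requested once for the task rather than once per output.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative sketch only; the nested types are hypothetical, not Tez APIs. */
final class CommitSketch {

  /** Hypothetical stand-in for an output such as MROutput. */
  interface CommittableOutput {
    boolean isCommitRequired();
    void commit() throws Exception;
    void abort() throws Exception; // assumed to exist for this sketch
  }

  /** Hypothetical stand-in for getContext().canCommit(). */
  interface CommitGate {
    boolean canCommit() throws Exception;
  }

  /**
   * Commits every output that requires it and returns how many were committed.
   * If any commit fails, all outputs that needed a commit are aborted and the
   * original failure is rethrown.
   */
  static int commitAll(Iterable<? extends CommittableOutput> outputs,
                       CommitGate gate,
                       long pollMillis) throws Exception {
    // Step 1: find out whether any output needs a commit at all.
    List<CommittableOutput> pending = new ArrayList<>();
    for (CommittableOutput output : outputs) {
      if (output.isCommitRequired()) {
        pending.add(output);
      }
    }
    if (pending.isEmpty()) {
      return 0; // nothing to commit, so no permission is requested
    }

    // Step 2: obtain permission once for the whole task, not per output.
    while (!gate.canCommit()) {
      Thread.sleep(pollMillis);
    }

    // Step 3: commit; on failure, abort every output that needed a commit.
    try {
      for (CommittableOutput output : pending) {
        output.commit();
      }
    } catch (Exception commitFailure) {
      for (CommittableOutput output : pending) {
        try {
          output.abort();
        } catch (Exception abortFailure) {
          commitFailure.addSuppressed(abortFailure);
        }
      }
      throw commitFailure;
    }
    return pending.size();
  }
}
```

In a SimpleProcessor-style base class, a postOp() could filter outputs.values() for MROutput instances and delegate to something like commitAll(), so that individual processors never touch the commit protocol themselves.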

> Create SimpleProcessor for processors that only need to implement the run method
> --------------------------------------------------------------------------------
>
>                 Key: TEZ-1062
>                 URL: https://issues.apache.org/jira/browse/TEZ-1062
>             Project: Apache Tez
>          Issue Type: Sub-task
>            Reporter: Bikas Saha
>            Assignee: Mohammad Kamrul Islam
>         Attachments: TEZ-1062.1.patch
>
>
> The SimpleProcessor could take care of all things like starting input,
> committing outputs. It would handle no events, since simple processors don't
> need to handle inputs. Thus the user would only need to implement their
> custom task logic in a new execute() method.