> On April 13, 2015, 3:50 a.m., Rohini Palaniswamy wrote: > > - What about Cross parallelism? > > - What about updating findquantiles constant for sampling? > > - Can we have a config to turn off grace parallelism if there are issues > > and fall back to the previous auto parallelism?
The cross parallelism is an optimization. As long as GFCross use the same parallelism in different vertexes, we still get the right result. And grace parallelism does not change that. Order by / skewed join is using PartitionerDefinedVertexManager. It is not affected by grace parallelism Adding a flag pig.tez.grace.parallelism for that. > On April 13, 2015, 3:50 a.m., Rohini Palaniswamy wrote: > > trunk/src/org/apache/pig/backend/hadoop/executionengine/tez/TezDagBuilder.java, > > line 440 > > <https://reviews.apache.org/r/32492/diff/1/?file=905799#file905799line440> > > > > Why should the datamovement type be null? Can you add a comment > > explaining why Put datamovement to null to prevent vertex "to" from starting. It will be started by PigGraceShuffleVertexManager > On April 13, 2015, 3:50 a.m., Rohini Palaniswamy wrote: > > trunk/src/org/apache/pig/backend/hadoop/executionengine/tez/plan/TezOperator.java, > > line 510 > > <https://reviews.apache.org/r/32492/diff/1/?file=905802#file905802line510> > > > > Why would plan be null? TezOperator.plan is marked transient. It will be null in the backend (PigGraceShuffleVertexManager) > On April 13, 2015, 3:50 a.m., Rohini Palaniswamy wrote: > > trunk/src/org/apache/pig/backend/hadoop/executionengine/tez/plan/optimizer/ParallelismSetter.java, > > line 164 > > <https://reviews.apache.org/r/32492/diff/1/?file=905804#file905804line164> > > > > Is this the case of vertex group for sample aggregation? In that case > > shouldn't we update the successor of the vertex group? Same as before: TezOperator.plan is marked transient. It will be null in the backend (PigGraceShuffleVertexManager) > On April 13, 2015, 3:50 a.m., Rohini Palaniswamy wrote: > > trunk/src/org/apache/pig/backend/hadoop/executionengine/tez/plan/optimizer/ParallelismSetter.java, > > line 110 > > <https://reviews.apache.org/r/32492/diff/1/?file=905804#file905804line110> > > > > Why do we need this check with 1-1 parallelism? If tezOp.estimatedParallelism already set, don't override. PigGraceShuffleVertexManager set the estimated parallelism according to the output data size of the node. Add comments. - Daniel ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/32492/#review79498 ----------------------------------------------------------- On April 12, 2015, 3:51 p.m., Daniel Dai wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/32492/ > ----------------------------------------------------------- > > (Updated April 12, 2015, 3:51 p.m.) > > > Review request for pig and Rohini Palaniswamy. > > > Repository: pig > > > Description > ------- > > See PIG-4434 > > > Diffs > ----- > > trunk/ivy/libraries.properties 1672805 > > trunk/src/org/apache/pig/backend/hadoop/executionengine/tez/TezDagBuilder.java > 1672805 > > trunk/src/org/apache/pig/backend/hadoop/executionengine/tez/plan/TezEdgeDescriptor.java > 1672805 > > trunk/src/org/apache/pig/backend/hadoop/executionengine/tez/plan/TezOperPlan.java > 1672805 > > trunk/src/org/apache/pig/backend/hadoop/executionengine/tez/plan/TezOperator.java > 1672805 > > trunk/src/org/apache/pig/backend/hadoop/executionengine/tez/plan/optimizer/LoaderProcessor.java > 1672805 > > trunk/src/org/apache/pig/backend/hadoop/executionengine/tez/plan/optimizer/ParallelismSetter.java > 1672805 > > trunk/src/org/apache/pig/backend/hadoop/executionengine/tez/plan/optimizer/TezEstimatedParallelismClearer.java > PRE-CREATION > > trunk/src/org/apache/pig/backend/hadoop/executionengine/tez/plan/optimizer/TezOperDependencyParallelismEstimator.java > 1672805 > > trunk/src/org/apache/pig/backend/hadoop/executionengine/tez/runtime/PartitionerDefinedVertexManager.java > 1672805 > > trunk/src/org/apache/pig/backend/hadoop/executionengine/tez/runtime/PigGraceShuffleVertexManager.java > PRE-CREATION > > trunk/src/org/apache/pig/backend/hadoop/executionengine/tez/util/TezCompilerUtil.java > 1672805 > trunk/src/org/apache/pig/impl/plan/OperatorKey.java 1672805 > > trunk/test/org/apache/pig/newplan/logical/optimizer/TestImplicitSplitOnTuple.java > 1672805 > trunk/test/org/apache/pig/test/Util.java 1672805 > trunk/test/org/apache/pig/tez/TestTezGraceParallelism.java PRE-CREATION > trunk/test/org/apache/pig/tez/TestTezJobControlCompiler.java 1672805 > trunk/test/org/apache/pig/tez/TestTezLauncher.java 1672805 > > Diff: https://reviews.apache.org/r/32492/diff/ > > > Testing > ------- > > All unit test pass pending (TEZ-2310), which is a Tez 0.7.0 regression not > related to the patch. > > > Thanks, > > Daniel Dai > >
