> On Nov. 8, 2013, 11:12 p.m., Rohini Palaniswamy wrote: > > src/org/apache/pig/backend/hadoop/executionengine/tez/TezPrinter.java, line > > 57 > > <https://reviews.apache.org/r/15261/diff/3/?file=380865#file380865line57> > > > > Can we print the combine plan after the plan on the vertex for easy > > readability?
Talked to Rohini offline. Currently, plan is printed in the following order: Map vertex pipeline -> Edge combine plan -> Reduce vertex pipeline - Cheolsoo ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/15261/#review28595 ----------------------------------------------------------- On Nov. 8, 2013, 7:57 a.m., Cheolsoo Park wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/15261/ > ----------------------------------------------------------- > > (Updated Nov. 8, 2013, 7:57 a.m.) > > > Review request for pig, Alex Bain, Daniel Dai, Mark Wagner, and Rohini > Palaniswamy. > > > Bugs: PIG-3555 > https://issues.apache.org/jira/browse/PIG-3555 > > > Repository: pig-git > > > Description > ------- > > Initial implementation of Tez combiner optimizer. The patch includes the > following changes- > * Factored out CombinerOptimizer code into a utility class called > CombinerOptimizerUtil. So both MR and Tez CombinerOptimizer use this utility > class instead of duplicating code. > * Introduced a new class called TezEdgeDescriptor that holds combine plans as > well as various edge properties. > * Added TezEdgeDescriptors to TezOperator. Note that I added multiple > descriptors for inbound edges but a single descriptor for all the outbound > edges. This is because TezDagBuilder always creates an edge by connecting > predecessors to the current vertex. Please let me know if you think we should > allow multiple descriptors for outbound edges too. > * Refactored some code in TezDagBuilder while touching it. > > > Diffs > ----- > > src/org/apache/pig/PigServer.java c0826ea > > src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/CombinerOptimizer.java > 18a382b > > src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/MRExecType.java > 07d737d > > src/org/apache/pig/backend/hadoop/executionengine/tez/CombinerOptimizer.java > e69de29 > src/org/apache/pig/backend/hadoop/executionengine/tez/TezCompiler.java > 0b1f3c9 > src/org/apache/pig/backend/hadoop/executionengine/tez/TezDagBuilder.java > 45e47b0 > > src/org/apache/pig/backend/hadoop/executionengine/tez/TezEdgeDescriptor.java > e69de29 > src/org/apache/pig/backend/hadoop/executionengine/tez/TezLauncher.java > 3f14644 > src/org/apache/pig/backend/hadoop/executionengine/tez/TezPrinter.java > 5a42ded > > src/org/apache/pig/backend/hadoop/executionengine/util/CombinerOptimizerUtil.java > e69de29 > test/org/apache/pig/test/TestCombiner.java 6252b51 > test/org/apache/pig/test/data/GoldenFiles/TEZC1.gld 925f07e > test/org/apache/pig/test/data/GoldenFiles/TEZC2.gld a3974fe > test/org/apache/pig/test/data/GoldenFiles/TEZC3.gld a8c942b > test/org/apache/pig/test/data/GoldenFiles/TEZC4.gld fb7c903 > test/org/apache/pig/test/data/GoldenFiles/TEZC5.gld e6cd25e > test/tez-tests c22a448 > > Diff: https://reviews.apache.org/r/15261/diff/ > > > Testing > ------- > > ant test-tez passes. > ant test-e2e-tez passes. > > I didn't add new test cases, but an e2e test case (Checkin_3) includes an > algebraic udf (count) following group-by. I also manually tested it on a live > cluster. > > > Thanks, > > Cheolsoo Park > >
