----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/19633/#review38532 -----------------------------------------------------------
This will not handle union followed by store right? Can we create a separate jira for that? src/org/apache/pig/backend/hadoop/executionengine/tez/POVertexGroupInputTez.java <https://reviews.apache.org/r/19633/#comment70769> Don't we have to iterate through all the values? Won't there be more than one if shuffle edge is used and input grouped? src/org/apache/pig/backend/hadoop/executionengine/tez/TezCompiler.java <https://reviews.apache.org/r/19633/#comment70770> Can you please add a TODO to use POValueOutputTez instead of POLocalRearrange and unsorted shuffle with TEZ-661 and PIG-3775 src/org/apache/pig/backend/hadoop/executionengine/tez/TezPrinter.java <https://reviews.apache.org/r/19633/#comment70772> hasVertexGroup() - Rohini Palaniswamy On March 25, 2014, 8:50 p.m., Cheolsoo Park wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/19633/ > ----------------------------------------------------------- > > (Updated March 25, 2014, 8:50 p.m.) > > > Review request for pig, Daniel Dai and Rohini Palaniswamy. > > > Bugs: PIG-3743 > https://issues.apache.org/jira/browse/PIG-3743 > > > Repository: pig-git > > > Description > ------- > > The patch reimplements union using Tez VertexGroup and GroupInputEdge. > > The changes include- > * Implemented POVertexGroupInputTez that takes > ConcatenatedMergedKeyValuesInput from VertexGroup. > * TezCompiler inserts an alias vertex for union, and the alias vertex is > converted to VertexGroup by TezDagBuilder. > * TezStats JobGraphBuilder removes alias vertices since they're not > materialized by Tez, and thus, there is no status for them. > > Note that- > * Further optimization is possible for the case where union is only followed > by store. In that case, we could directly attach a MROutput to VertexGroup > instead of adding another vertex that runs the MROutput. I'll follow up with > this soon. > * POLocalRearrangeTez is added to each union source because > ConcatenatedMergedKeyValuesInput expected ShuffledMergedInputs that requires > sorting. > > > Diffs > ----- > > src/org/apache/pig/backend/hadoop/executionengine/tez/POUnionTezLoad.java > e496ca8 > > src/org/apache/pig/backend/hadoop/executionengine/tez/POVertexGroupInputTez.java > e69de29 > src/org/apache/pig/backend/hadoop/executionengine/tez/TezCompiler.java > 245cade > src/org/apache/pig/backend/hadoop/executionengine/tez/TezDagBuilder.java > bce8963 > src/org/apache/pig/backend/hadoop/executionengine/tez/TezOperator.java > b3aa020 > src/org/apache/pig/backend/hadoop/executionengine/tez/TezPrinter.java > f00946c > src/org/apache/pig/tools/pigstats/tez/TezStats.java feac11d > test/org/apache/pig/test/data/GoldenFiles/TEZC19.gld e69de29 > test/org/apache/pig/tez/TestTezCompiler.java e71d838 > > Diff: https://reviews.apache.org/r/19633/diff/ > > > Testing > ------- > > ant test-tez passes except TestTezCompiler (known issue). > tez e2e tests all pass. > > > Thanks, > > Cheolsoo Park > >
