> On March 25, 2014, 9:27 p.m., Rohini Palaniswamy wrote: > > This will not handle union followed by store right? Can we create a > > separate jira for that?
Correct. Like I said in the description, MROutput should be added directly to VertexGroup. I will file a jira. > On March 25, 2014, 9:27 p.m., Rohini Palaniswamy wrote: > > src/org/apache/pig/backend/hadoop/executionengine/tez/POVertexGroupInputTez.java, > > lines 82-83 > > <https://reviews.apache.org/r/19633/diff/1/?file=535690#file535690line82> > > > > Don't we have to iterate through all the values? Won't there be more > > than one if shuffle edge is used and input grouped? No, I don't think so. It should be handled by ConcatenatedMergedKeyValuesInput. If I am understanding it correctly, calling reader.next() updates the underlying reader. So the following reader.getCurrentValues().iterator() returns the new iterator if the previous reader is done. In fact, I can confirm the result by testing union with multiple sources, e.g. 3+. It seems correct. > On March 25, 2014, 9:27 p.m., Rohini Palaniswamy wrote: > > src/org/apache/pig/backend/hadoop/executionengine/tez/TezCompiler.java, > > line 1979 > > <https://reviews.apache.org/r/19633/diff/1/?file=535691#file535691line1979> > > > > Can you please add a TODO to use POValueOutputTez instead of > > POLocalRearrange and unsorted shuffle with TEZ-661 and PIG-3775 Sure. > On March 25, 2014, 9:27 p.m., Rohini Palaniswamy wrote: > > src/org/apache/pig/backend/hadoop/executionengine/tez/TezPrinter.java, line > > 54 > > <https://reviews.apache.org/r/19633/diff/1/?file=535694#file535694line54> > > > > hasVertexGroup() I 100% agree hasVertexGroup() reads better, but since explain doesn't go through TezDagBuilder, VertexGroup of tezOper is not set when TezPrinter traverses the plan. Maybe I consolidate hasVertexGroup() and isUnion() into one, and then use it everywhere. - Cheolsoo ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/19633/#review38532 ----------------------------------------------------------- On March 25, 2014, 8:50 p.m., Cheolsoo Park wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/19633/ > ----------------------------------------------------------- > > (Updated March 25, 2014, 8:50 p.m.) > > > Review request for pig, Daniel Dai and Rohini Palaniswamy. > > > Bugs: PIG-3743 > https://issues.apache.org/jira/browse/PIG-3743 > > > Repository: pig-git > > > Description > ------- > > The patch reimplements union using Tez VertexGroup and GroupInputEdge. > > The changes include- > * Implemented POVertexGroupInputTez that takes > ConcatenatedMergedKeyValuesInput from VertexGroup. > * TezCompiler inserts an alias vertex for union, and the alias vertex is > converted to VertexGroup by TezDagBuilder. > * TezStats JobGraphBuilder removes alias vertices since they're not > materialized by Tez, and thus, there is no status for them. > > Note that- > * Further optimization is possible for the case where union is only followed > by store. In that case, we could directly attach a MROutput to VertexGroup > instead of adding another vertex that runs the MROutput. I'll follow up with > this soon. > * POLocalRearrangeTez is added to each union source because > ConcatenatedMergedKeyValuesInput expected ShuffledMergedInputs that requires > sorting. > > > Diffs > ----- > > src/org/apache/pig/backend/hadoop/executionengine/tez/POUnionTezLoad.java > e496ca8 > > src/org/apache/pig/backend/hadoop/executionengine/tez/POVertexGroupInputTez.java > e69de29 > src/org/apache/pig/backend/hadoop/executionengine/tez/TezCompiler.java > 245cade > src/org/apache/pig/backend/hadoop/executionengine/tez/TezDagBuilder.java > bce8963 > src/org/apache/pig/backend/hadoop/executionengine/tez/TezOperator.java > b3aa020 > src/org/apache/pig/backend/hadoop/executionengine/tez/TezPrinter.java > f00946c > src/org/apache/pig/tools/pigstats/tez/TezStats.java feac11d > test/org/apache/pig/test/data/GoldenFiles/TEZC19.gld e69de29 > test/org/apache/pig/tez/TestTezCompiler.java e71d838 > > Diff: https://reviews.apache.org/r/19633/diff/ > > > Testing > ------- > > ant test-tez passes except TestTezCompiler (known issue). > tez e2e tests all pass. > > > Thanks, > > Cheolsoo Park > >
