----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/24919/#review51222 -----------------------------------------------------------
Hi Szehon! Thank you very much for this patch! It will enable us to pull in some of the join work! I am also looking at this patch as an opportunity to clean a few items up. Do you mind? ql/src/java/org/apache/hadoop/hive/ql/exec/spark/GraphTran.java <https://reviews.apache.org/r/24919/#comment89301> while you are in here can you add final to these members? ql/src/java/org/apache/hadoop/hive/ql/exec/spark/GraphTran.java <https://reviews.apache.org/r/24919/#comment89297> Should this be IllegalStateException? Perhaps "Connection from " + parent + " to " + child already exists ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkPlanGenerator.java <https://reviews.apache.org/r/24919/#comment89294> nit final ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkPlanGenerator.java <https://reviews.apache.org/r/24919/#comment89308> I am kind of stupid...would you mind adding more on why we need to have the UT in that location? ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkPlanGenerator.java <https://reviews.apache.org/r/24919/#comment89295> nit if ( not null) else is more simple when the two blocks are reversed if null else ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkPlanGenerator.java <https://reviews.apache.org/r/24919/#comment89296> same if not null issue here ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkPlanGenerator.java <https://reviews.apache.org/r/24919/#comment89291> Looks like this is on one line? ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkPlanGenerator.java <https://reviews.apache.org/r/24919/#comment89292> I know you didn't write this method but can we remove it? Hi S - Brock Noland On Aug. 21, 2014, 2:11 a.m., Szehon Ho wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/24919/ > ----------------------------------------------------------- > > (Updated Aug. 21, 2014, 2:11 a.m.) > > > Review request for hive and Brock Noland. > > > Bugs: HIVE-7815 > https://issues.apache.org/jira/browse/HIVE-7815 > > > Repository: hive-git > > > Description > ------- > > This is the first part of the reduce-side join work. See HIVE-7384 for the > overall design doc. > > This patch inserts a UnionTran after the two join inputs, and thus leverages > the Union-all code path to run the Spark RDD. I also made the following > changes: > > 1. Some API cleanup of GraphTran. Connect will automatically add the child, > so no need for multiple calls. > 2. Fix a bug in HiveBaseReduceFunction. HIVE-7652 made the iterator return > false after close if there's more rows, so Spark calls hasNext again and > close thus gets called twice. CommonJoinOperator throws exception if close > gets called more than once. So adding a check there. > > > Diffs > ----- > > itests/src/test/resources/testconfiguration.properties 63af01d > ql/src/java/org/apache/hadoop/hive/ql/exec/spark/GraphTran.java 03f0ff8 > > ql/src/java/org/apache/hadoop/hive/ql/exec/spark/HiveBaseFunctionResultList.java > 6568a76 > ql/src/java/org/apache/hadoop/hive/ql/exec/spark/SparkPlanGenerator.java > d16f1be > ql/src/test/results/clientpositive/spark/join0.q.out PRE-CREATION > ql/src/test/results/clientpositive/spark/join1.q.out PRE-CREATION > ql/src/test/results/clientpositive/spark/join_casesensitive.q.out > PRE-CREATION > > Diff: https://reviews.apache.org/r/24919/diff/ > > > Testing > ------- > > Added three join tests to the TestSparkCliDriver suite. > > > Thanks, > > Szehon Ho > >