-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65174/
-----------------------------------------------------------

(Updated May 31, 2018, 2:47 a.m.)


Review request for hive.


Changes
-------

I rebased the patch on the latest master branch.


Bugs: HIVE-17896
    https://issues.apache.org/jira/browse/HIVE-17896


Repository: hive-git


Description
-------

For TPC-DS Query27, the TopN operation is delayed by the group-by: the group-by operator buffers up all the rows, and only afterwards does the TopN Hash within the ReduceSink operator discard 99% of them.

The RS TopN operator is very restrictive, as it only supports filtering on the shuffle keys. It is better to do this filtering earlier, before the vectors are broken into rows and the isRepeating properties are lost.

Adding a TopN Key operator to the physical operator tree allows the following rewrite:

GBY->RS(Top=1)

can become

TNK(1)->GBY->RS(Top=1)

This way, the TopNKey operator can remove rows before they are buffered into the GBY and consume memory.

Here's the equivalent implementation in Presto:

https://github.com/prestodb/presto/blob/master/presto-main/src/main/java/com/facebook/presto/operator/TopNOperator.java#L35

Implementing this as a sub-feature of GroupBy instead would prevent further optimizations when the GBY is on keys "a,b,c" and the TopNKey is on just "a".
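
To illustrate the idea in the description above, here is a minimal sketch of a top-N key filter. It is not the code in this patch: the class name TopNKeyFilter and the method canForward are hypothetical, and the sketch assumes a single comparable key per row. It keeps the N smallest distinct keys seen so far and only lets a row through if its key is, or becomes, one of them.

```java
import java.util.Comparator;
import java.util.TreeSet;

/**
 * Hypothetical sketch of a top-N key filter (not the patch's TopNKeyOperator).
 * It tracks the N smallest distinct keys seen so far, in comparator order.
 */
final class TopNKeyFilter<K> {
    private final int topN;
    private final TreeSet<K> topKeys;

    TopNKeyFilter(int topN, Comparator<? super K> comparator) {
        if (topN <= 0) {
            throw new IllegalArgumentException("topN must be positive");
        }
        this.topN = topN;
        this.topKeys = new TreeSet<>(comparator);
    }

    /** Returns true if a row with this key should be passed downstream. */
    boolean canForward(K key) {
        if (topKeys.contains(key)) {
            return true;                 // key is already among the top N
        }
        if (topKeys.size() < topN) {
            topKeys.add(key);            // still filling the first N keys
            return true;
        }
        if (topKeys.comparator().compare(key, topKeys.last()) < 0) {
            topKeys.add(key);            // new key beats the current worst key
            topKeys.pollLast();          // evict the worst key to stay at N
            return true;
        }
        return false;                    // key cannot be in the top N: drop row
    }

    public static void main(String[] args) {
        TopNKeyFilter<Integer> filter =
            new TopNKeyFilter<>(2, Comparator.<Integer>naturalOrder());
        int[] keys = {5, 3, 5, 8, 1, 3, 4};
        for (int key : keys) {
            System.out.println(key + " -> " + filter.canForward(key));
        }
    }
}
```

In this sketch, a row forwarded early can carry a key that is evicted later, so the filter only reduces the input; something downstream still has to do the exact TopN. That matches the plan shape TNK(1)->GBY->RS(Top=1), where the RS keeps its own TopN.
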


Diffs (updated)
-----

  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 3295d1dbc5
  itests/src/test/resources/testconfiguration.properties 6a70a4a6bd
  ql/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/ql/plan/api/OperatorType.java a002348013
  ql/src/java/org/apache/hadoop/hive/ql/exec/KeyWrapperFactory.java 3c7f0b78c2
  ql/src/java/org/apache/hadoop/hive/ql/exec/OperatorFactory.java 7bb6590d5e
  ql/src/java/org/apache/hadoop/hive/ql/exec/TopNKeyOperator.java PRE-CREATION
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorTopNKeyOperator.java PRE-CREATION
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/TopNKeyProcessor.java PRE-CREATION
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/TopNKeyPushdown.java PRE-CREATION
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/Vectorizer.java 394f826508
  ql/src/java/org/apache/hadoop/hive/ql/parse/TezCompiler.java dfd790853b
  ql/src/java/org/apache/hadoop/hive/ql/plan/TopNKeyDesc.java PRE-CREATION
  ql/src/java/org/apache/hadoop/hive/ql/plan/VectorTopNKeyDesc.java PRE-CREATION
  ql/src/test/queries/clientpositive/topnkey.q PRE-CREATION
  ql/src/test/queries/clientpositive/vector_topnkey.q PRE-CREATION
  ql/src/test/results/clientpositive/llap/topnkey.q.out PRE-CREATION
  ql/src/test/results/clientpositive/llap/vector_topnkey.q.out PRE-CREATION
  ql/src/test/results/clientpositive/tez/topnkey.q.out PRE-CREATION
  ql/src/test/results/clientpositive/tez/vector_topnkey.q.out PRE-CREATION
  ql/src/test/results/clientpositive/topnkey.q.out PRE-CREATION
  ql/src/test/results/clientpositive/vector_topnkey.q.out PRE-CREATION
  serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/ObjectInspectorUtils.java a442cb1228


Diff: https://reviews.apache.org/r/65174/diff/2/

Changes: https://reviews.apache.org/r/65174/diff/1-2/


Testing
-------


Thanks,

Teddy Choi