[
https://issues.apache.org/jira/browse/HIVE-17896?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16308397#comment-16308397
]
Hive QA commented on HIVE-17896:
--------------------------------
Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12904230/HIVE-17896.4.patch
{color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified.
{color:red}ERROR:{color} -1 due to 20 failed/errored test(s), 11546 tests
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_sortmerge_join_2]
(batchId=48)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[lateral_view_ppd]
(batchId=87)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[ppd_join5] (batchId=35)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[bucketsortoptimize_insert_2]
(batchId=151)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[hybridgrace_hashjoin_2]
(batchId=156)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[insert_values_orig_table_use_metadata]
(batchId=164)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[llap_acid]
(batchId=168)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[llap_acid_fast]
(batchId=159)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[sysdb]
(batchId=159)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_reduce_groupby_duplicate_cols]
(batchId=158)
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[authorization_part]
(batchId=93)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[ppd_join5]
(batchId=120)
org.apache.hadoop.hive.cli.control.TestDanglingQOuts.checkDanglingQOut
(batchId=208)
org.apache.hadoop.hive.metastore.TestEmbeddedHiveMetaStore.testTransactionalValidation
(batchId=213)
org.apache.hadoop.hive.ql.TestAcidOnTez.testMapJoinOnTez (batchId=222)
org.apache.hadoop.hive.ql.io.TestDruidRecordWriter.testWrite (batchId=253)
org.apache.hadoop.hive.ql.parse.TestReplicationScenarios.testConstraints
(batchId=225)
org.apache.hive.jdbc.TestSSL.testConnectionMismatch (batchId=231)
org.apache.hive.jdbc.TestSSL.testConnectionWrongCertCN (batchId=231)
org.apache.hive.jdbc.TestSSL.testMetastoreConnectionWrongCertCN (batchId=231)
{noformat}
Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/8396/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/8396/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-8396/
Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 20 tests failed
{noformat}
This message is automatically generated.
ATTACHMENT ID: 12904230 - PreCommit-HIVE-Build
> TopNKey: Create a standalone vectorizable TopNKey operator
> ----------------------------------------------------------
>
> Key: HIVE-17896
> URL: https://issues.apache.org/jira/browse/HIVE-17896
> Project: Hive
> Issue Type: New Feature
> Components: Operators
> Affects Versions: 3.0.0
> Reporter: Gopal V
> Assignee: Teddy Choi
> Attachments: HIVE-17896.1.patch, HIVE-17896.3.patch,
> HIVE-17896.4.patch
>
>
> For TPC-DS Query27, the TopN operation is delayed by the group-by - the
> group-by operator buffers up all the rows before discarding the 99% of the
> rows in the TopN Hash within the ReduceSink Operator.
> The RS TopN operator is very restrictive as it only supports doing the
> filtering on the shuffle keys, but it is better to do this before breaking
> the vectors into rows and losing the isRepeating properties.
> Adding a TopN Key operator in the physical operator tree allows the following
> to happen.
> GBY->RS(Top=1)
> can become
> TNK(1)->GBY->RS(Top=1)
> So that, the TopNKey can remove rows before they are buffered into the GBY
> and consume memory.
> Here's the equivalent implementation in Presto
> https://github.com/prestodb/presto/blob/master/presto-main/src/main/java/com/facebook/presto/operator/TopNOperator.java#L35
> Adding this as a sub-feature of GroupBy prevents further optimizations if the
> GBY is on keys "a,b,c" and the TopNKey is on just "a".
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)