[ https://issues.apache.org/jira/browse/HIVE-25316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17379060#comment-17379060 ]
Krisztian Kasa commented on HIVE-25316:
---------------------------------------
A wrong plan is generated after pushing the TopNKey operator through the Select operator.
{code}
TableScan
alias: store_sales
properties:
hive.sql.query SELECT "ss_store_sk", SUM("ss_net_profit") AS "$f1"
FROM "STORE_SALES"
GROUP BY "ss_store_sk"
hive.sql.query.fieldNames ss_store_sk,$f1
hive.sql.query.fieldTypes int,decimal(17,2)
hive.sql.query.split false
Statistics: Num rows: 1 Data size: 116 Basic stats: COMPLETE Column stats: NONE
Top N Key Operator
sort order: ++
keys: ss_store_sk (type: int), $f1 (type: decimal(17,2))
null sort order: az
Map-reduce partition columns: _col0 (type: int)
Statistics: Num rows: 1 Data size: 116 Basic stats: COMPLETE Column stats: NONE
top n: 6
Select Operator
expressions: ss_store_sk (type: int), $f1 (type: decimal(17,2))
outputColumnNames: _col0, _col1
{code}
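The *Map-reduce partition columns* property contains an invalid column reference.
When the parent is a TableScan, it should reference columns by their names, as the
*keys* property does. Instead it still uses {{_col0}}, the output name assigned by
the Select operator that is now below the Top N Key Operator: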
{code}
Map-reduce partition columns: _col0 (type: int)
{code}
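As a rough illustration of the mapping that seems to be missing, here is a minimal sketch in plain Java. It is not Hive code: the class {{PartitionKeyRemapSketch}}, the method {{remapThroughSelect}} and the map {{selectOutputToInput}} are hypothetical names invented for this example, and it only handles plain column references, not arbitrary expressions. It assumes the Select operator can be summarized as a map from its output column names to the input columns they are computed from.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class PartitionKeyRemapSketch {

    // Hypothetical helper (not part of Hive): rewrites column references that
    // use the Select operator's output names (_col0, _col1, ...) into the
    // names of the Select operator's input columns.
    public static List<String> remapThroughSelect(List<String> columns,
                                                  Map<String, String> selectOutputToInput) {
        List<String> remapped = new ArrayList<>();
        for (String column : columns) {
            String input = selectOutputToInput.get(column);
            if (input == null) {
                // The column is not produced by the Select operator, so it
                // cannot be expressed in terms of the Select's input.
                throw new IllegalArgumentException(
                    "cannot map column " + column + " through select " + selectOutputToInput);
            }
            remapped.add(input);
        }
        return remapped;
    }

    public static void main(String[] args) {
        // Mirrors the Select Operator in the plan above:
        //   expressions: ss_store_sk, $f1
        //   outputColumnNames: _col0, _col1
        Map<String, String> selectOutputToInput = new LinkedHashMap<>();
        selectOutputToInput.put("_col0", "ss_store_sk");
        selectOutputToInput.put("_col1", "$f1");

        // The partition columns as they appear in the wrong plan.
        List<String> partitionColumns = Arrays.asList("_col0");

        // Prints [ss_store_sk]
        System.out.println(remapThroughSelect(partitionColumns, selectOutputToInput));
    }
}
```

With such a mapping applied, the partition columns would read {{ss_store_sk (type: int)}}, consistent with the *keys* property of the Top N Key Operator.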
> Map partition key columns when pushing TNK op through select.
> -------------------------------------------------------------
>
> Key: HIVE-25316
> URL: https://issues.apache.org/jira/browse/HIVE-25316
> Project: Hive
> Issue Type: Bug
> Components: JDBC storage handler, Query Processor
> Affects Versions: 4.0.0
> Reporter: Stamatis Zampetakis
> Assignee: Krisztian Kasa
> Priority: Major
> Attachments: external_jdbc_table_perf2.q
>
>
> The following TPC-DS query fails at runtime when the table {{store_sales}} is
> an external JDBC table.
> {code:sql}
> SELECT ranking
> FROM
> (SELECT rank() OVER (PARTITION BY ss_store_sk
> ORDER BY sum(ss_net_profit)) AS ranking
> FROM store_sales
> GROUP BY ss_store_sk) tmp1
> WHERE ranking <= 5
> {code}
> The stack trace below shows that the problem occurs while trying to initialize the
> {{TopNKeyOperator}}.
> {noformat}
> 2021-07-08T09:04:37,444 ERROR [TezTR-270335_1_3_0_0_0] tez.TezProcessor: Failed initializeAndRunProcessor
> java.lang.RuntimeException: Map operator initialization failed
> at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:351) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:310) [hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:277) [hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:381) [tez-runtime-internals-0.10.0.jar:0.10.0]
> at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:75) [tez-runtime-internals-0.10.0.jar:0.10.0]
> at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:62) [tez-runtime-internals-0.10.0.jar:0.10.0]
> at java.security.AccessController.doPrivileged(Native Method) ~[?:1.8.0_261]
> at javax.security.auth.Subject.doAs(Subject.java:422) [?:1.8.0_261]
> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1682) [hadoop-common-3.1.0.jar:?]
> at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:62) [tez-runtime-internals-0.10.0.jar:0.10.0]
> at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:38) [tez-runtime-internals-0.10.0.jar:0.10.0]
> at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36) [tez-common-0.10.0.jar:0.10.0]
> at org.apache.hadoop.hive.llap.daemon.impl.StatsRecordingThreadPool$WrappedCallable.call(StatsRecordingThreadPool.java:118) [hive-llap-server-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_261]
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_261]
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_261]
> at java.lang.Thread.run(Thread.java:748) [?:1.8.0_261]
> Caused by: java.lang.RuntimeException: cannot find field _col0 from [0:ss_store_sk, 1:$f1]
> at org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.getStandardStructFieldRef(ObjectInspectorUtils.java:550) ~[hive-serde-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at org.apache.hadoop.hive.serde2.objectinspector.StandardStructObjectInspector.getStructFieldRef(StandardStructObjectInspector.java:153) ~[hive-serde-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at org.apache.hadoop.hive.ql.exec.ExprNodeColumnEvaluator.initialize(ExprNodeColumnEvaluator.java:56) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at org.apache.hadoop.hive.ql.exec.TopNKeyOperator.initObjectInspectors(TopNKeyOperator.java:101) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at org.apache.hadoop.hive.ql.exec.TopNKeyOperator.initializeOp(TopNKeyOperator.java:82) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:360) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:549) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:503) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:369) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at org.apache.hadoop.hive.ql.exec.MapOperator.initializeMapOperator(MapOperator.java:506) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:314) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> ... 16 more
> {noformat}