[
https://issues.apache.org/jira/browse/HIVE-18148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16285468#comment-16285468
]
liyunzhang commented on HIVE-18148:
-----------------------------------
[~lirui]: I cannot reproduce it because you did not provide the full script to
reproduce it.
But I guess the following script matches the case you mentioned:
[spark_dynamic_partition_pruning.q#L29|https://github.com/kellyzly/hive/blob/master/ql/src/test/queries/clientpositive/spark_dynamic_partition_pruning.q#L29]
{code}
EXPLAIN select count(*) from srcpart join srcpart_date on (srcpart.ds =
srcpart_date.ds) join srcpart_hour on (srcpart.hr = srcpart_hour.hr)
where srcpart_date.`date` = '2008-04-08' and srcpart_hour.hour = 11;
{code}
srcpart is similar to src, srcpart_date is similar to part1, and srcpart_hour
is similar to part2 in your example. But the operator tree looks like:
{code}
TS[0]-SEL[2]-MAPJOIN[36]-MAPJOIN[35]-GBY[16]-RS[17]-GBY[18]-FS[20]
TS[3]-FIL[27]-SEL[5]-RS[10]-MAPJOIN[36]
                    -SEL[29]-GBY[30]-SPARKPRUNINGSINK[31]
TS[6]-FIL[28]-SEL[8]-RS[13]-MAPJOIN[35]
                    -SEL[32]-GBY[33]-SPARKPRUNINGSINK[34]
{code}
Here TS\[3\] is srcpart_date, TS\[6\] is srcpart_hour, and TS\[0\] is srcpart.
But there is no nested DPP problem here. So what is wrong?
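For context, the NPE in the quoted issue amounts to looking up a DPP sink's target map work and dereferencing a null result when the sink is orphaned. A minimal, self-contained sketch of that lookup and the defensive guard (the class and method names here are illustrative assumptions, not Hive's actual code):

{code:java}
import java.util.HashMap;
import java.util.Map;

public class DppSinkGuard {
    // Hypothetical stand-in for the sink -> target work mapping that the
    // resolver consults; in Hive the real types are
    // SparkPartitionPruningSinkOperator and MapWork.
    static final Map<String, String> sinkToTargetWork = new HashMap<>();

    static String resolveTarget(String sinkId) {
        String target = sinkToTargetWork.get(sinkId);
        if (target == null) {
            // Without this guard, invoking a method on `target` here would
            // throw the NullPointerException shown in the stack trace.
            return "orphaned DPP sink: " + sinkId;
        }
        return "prune " + target + " using " + sinkId;
    }

    public static void main(String[] args) {
        // Only one sink has a registered target, mimicking a malformed
        // operator tree where a sink's target map work went missing.
        sinkToTargetWork.put("SPARKPRUNINGSINK[31]", "Map 1");
        System.out.println(resolveTarget("SPARKPRUNINGSINK[31]"));
        System.out.println(resolveTarget("SPARKPRUNINGSINK[34]"));
    }
}
{code}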
> NPE in SparkDynamicPartitionPruningResolver
> -------------------------------------------
>
> Key: HIVE-18148
> URL: https://issues.apache.org/jira/browse/HIVE-18148
> Project: Hive
> Issue Type: Bug
> Components: Spark
> Reporter: Rui Li
> Assignee: Rui Li
>
> The stack trace is:
> {noformat}
> 2017-11-27T10:32:38,752 ERROR [e6c8aab5-ddd2-461d-b185-a7597c3e7519 main] ql.Driver: FAILED: NullPointerException null
> java.lang.NullPointerException
> at org.apache.hadoop.hive.ql.optimizer.physical.SparkDynamicPartitionPruningResolver$SparkDynamicPartitionPruningDispatcher.dispatch(SparkDynamicPartitionPruningResolver.java:100)
> at org.apache.hadoop.hive.ql.lib.TaskGraphWalker.dispatch(TaskGraphWalker.java:111)
> at org.apache.hadoop.hive.ql.lib.TaskGraphWalker.walk(TaskGraphWalker.java:180)
> at org.apache.hadoop.hive.ql.lib.TaskGraphWalker.startWalking(TaskGraphWalker.java:125)
> at org.apache.hadoop.hive.ql.optimizer.physical.SparkDynamicPartitionPruningResolver.resolve(SparkDynamicPartitionPruningResolver.java:74)
> at org.apache.hadoop.hive.ql.parse.spark.SparkCompiler.optimizeTaskPlan(SparkCompiler.java:568)
> {noformat}
> At this stage, there shouldn't be a DPP sink whose target map work is null.
> The root cause seems to be a malformed operator tree generated by
> SplitOpTreeForDPP.