[ https://issues.apache.org/jira/browse/HIVE-18148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16297994#comment-16297994 ]
Rui Li commented on HIVE-18148:
-------------------------------
Both the target table size and the DPP sink output size (a smaller output
means more partitions are pruned) should be taken into account if we want to
base the decision on statistics. Besides, we also need to consider the cost of
re-computation, as I mentioned above. Let's handle that as a follow-up.
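
To make the trade-off above concrete, here is a rough sketch of what a
statistics-based check might look like. Everything in it is hypothetical: the
class, the method, the parameter names, and the threshold are illustrative
assumptions, not Hive code or an existing Hive configuration, and the
comparison itself is only one possible way to weigh the three factors.

```java
/**
 * Hypothetical sketch only. None of these names exist in Hive.
 * It weighs the three factors mentioned in the comment: target table size,
 * DPP sink output size, and the cost of re-computation.
 */
public final class DppDecisionSketch {

    private DppDecisionSketch() {
    }

    /**
     * Returns true if keeping the DPP sink looks worthwhile.
     *
     * @param targetTableBytes   estimated size of the table to be pruned
     * @param dppSinkOutputBytes estimated size of the DPP sink output
     * @param recomputeCostBytes estimated cost of re-computing the work that
     *                           feeds the sink, in the same units (assumption)
     * @param maxSinkOutputBytes assumed upper bound on the sink output size
     */
    public static boolean shouldKeepDpp(long targetTableBytes,
                                        long dppSinkOutputBytes,
                                        long recomputeCostBytes,
                                        long maxSinkOutputBytes) {
        // A large sink output suggests that few partitions would be pruned.
        if (dppSinkOutputBytes > maxSinkOutputBytes) {
            return false;
        }
        // Keep DPP only if the table that may be pruned outweighs the
        // combined cost of producing the sink output and re-computing.
        return targetTableBytes > dppSinkOutputBytes + recomputeCostBytes;
    }
}
```

With this sketch, a small sink output against a large target table keeps DPP,
while a small target table or an expensive re-computation drops it.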
> NPE in SparkDynamicPartitionPruningResolver
> -------------------------------------------
>
> Key: HIVE-18148
> URL: https://issues.apache.org/jira/browse/HIVE-18148
> Project: Hive
> Issue Type: Bug
> Components: Spark
> Reporter: Rui Li
> Assignee: Rui Li
> Attachments: HIVE-18148.1.patch, HIVE-18148.2.patch
>
>
> The stack trace is:
> {noformat}
> 2017-11-27T10:32:38,752 ERROR [e6c8aab5-ddd2-461d-b185-a7597c3e7519 main] ql.Driver: FAILED: NullPointerException null
> java.lang.NullPointerException
> at org.apache.hadoop.hive.ql.optimizer.physical.SparkDynamicPartitionPruningResolver$SparkDynamicPartitionPruningDispatcher.dispatch(SparkDynamicPartitionPruningResolver.java:100)
> at org.apache.hadoop.hive.ql.lib.TaskGraphWalker.dispatch(TaskGraphWalker.java:111)
> at org.apache.hadoop.hive.ql.lib.TaskGraphWalker.walk(TaskGraphWalker.java:180)
> at org.apache.hadoop.hive.ql.lib.TaskGraphWalker.startWalking(TaskGraphWalker.java:125)
> at org.apache.hadoop.hive.ql.optimizer.physical.SparkDynamicPartitionPruningResolver.resolve(SparkDynamicPartitionPruningResolver.java:74)
> at org.apache.hadoop.hive.ql.parse.spark.SparkCompiler.optimizeTaskPlan(SparkCompiler.java:568)
> {noformat}
> At this stage, there shouldn't be a DPP sink whose target map work is null.
> The root cause seems to be a malformed operator tree generated by
> SplitOpTreeForDPP.