[ https://issues.apache.org/jira/browse/HIVE-15235?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Sergey Shelukhin updated HIVE-15235:
------------------------------------
Description:
The relevant fragment of the reduce plan of a Tez job is as follows:
{noformat}
<MERGEJOIN>Id =13
<Children>
<SEL>Id =12
<Children>
<MAPJOIN>Id =10
<Children>
...
<\Children>
<Parent>Id = 12 nullId = 9
<HASHTABLEDUMMY>Id =9
<Children>null
<\Children>
<Parent><\Parent>
<\HASHTABLEDUMMY><\Parent>
<\MAPJOIN>
<\Children>
<Parent>Id = 13 null<\Parent>
<\SEL>
<\Children>
{noformat}
When sortmergejoin is enabled, dummy operators are not initialized during
operator initialization (presumably because they are not present in the work).
As a result, MapJoin is never initialized, even though its proper parent is.
This manifests as an NPE:
{noformat}
Caused by: java.lang.NullPointerException
at org.apache.hadoop.hive.ql.exec.MapJoinOperator.process(MapJoinOperator.java:350)
{noformat}
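To illustrate the failure mode described above, here is a minimal sketch of a toy operator tree. It is not Hive's actual code: the class DummyParentSketch, the Op type, and buildAndInit are hypothetical names, and the rule that an operator initializes only once all of its parents have initialized is my assumption about how initialization is gated. The sketch models only the SEL (Id 12), MAPJOIN (Id 10), and HASHTABLEDUMMY (Id 9) nodes from the plan and omits MERGEJOIN.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model, NOT Hive's actual classes: every name here is hypothetical.
public class DummyParentSketch {

    static class Op {
        final String name;
        final List<Op> parents = new ArrayList<>();
        final List<Op> children = new ArrayList<>();
        boolean initialized = false;
        // State that only exists once initialization has completed.
        List<String> state = null;

        Op(String name) {
            this.name = name;
        }

        static void link(Op parent, Op child) {
            parent.children.add(child);
            child.parents.add(parent);
        }

        // Assumed rule: an operator initializes only after every parent has
        // initialized, and then forwards the call to its children.
        void initialize() {
            for (Op p : parents) {
                if (!p.initialized) {
                    return; // still waiting for another parent
                }
            }
            state = new ArrayList<>();
            initialized = true;
            for (Op c : children) {
                c.initialize();
            }
        }

        void process(String row) {
            // Throws NullPointerException if initialize() never completed.
            state.add(row);
        }
    }

    // Builds SEL -> MAPJOIN <- HASHTABLEDUMMY, runs initialization,
    // and returns the MAPJOIN node.
    static Op buildAndInit(boolean initDummies) {
        Op sel = new Op("SEL_12");
        Op dummy = new Op("HASHTABLEDUMMY_9");
        Op mapJoin = new Op("MAPJOIN_10");
        Op.link(sel, mapJoin);
        Op.link(dummy, mapJoin);
        if (initDummies) {
            dummy.initialize(); // the step that is skipped in the bug
        }
        sel.initialize();
        return mapJoin;
    }

    public static void main(String[] args) {
        Op broken = buildAndInit(false);
        System.out.println("MAPJOIN initialized without dummy init: " + broken.initialized);
        try {
            broken.process("row");
        } catch (NullPointerException e) {
            System.out.println("process() threw NullPointerException");
        }
        Op ok = buildAndInit(true);
        System.out.println("MAPJOIN initialized with dummy init: " + ok.initialized);
    }
}
```

In this sketch, skipping the dummy parent leaves MAPJOIN uninitialized even though SEL initialized normally, so the first call to process() dereferences null state. Whether the real fix belongs in plan generation or in initialization is not something the sketch addresses.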
was:
{noformat}
<MERGEJOIN>Id =13
<Children>
<SEL>Id =12
<Children>
<MAPJOIN>Id =10
<Children>
<FS>Id =11
<Children>
<\Children>
<Parent>Id = 10 null<\Parent>
<\FS>
<\Children>
<Parent>Id = 12 nullId = 9
<HASHTABLEDUMMY>Id =9
<Children>null
<\Children>
<Parent><\Parent>
<\HASHTABLEDUMMY><\Parent>
<\MAPJOIN>
<\Children>
<Parent>Id = 13 null<\Parent>
<\SEL>
<\Children>
{noformat}
This only happens when sortmergejoin is enabled.
This is on the reduce side of a Tez job; during initialization, dummy operators are
not initialized (presumably, they are not present in the work); that results in
MapJoin not being initialized, even though its proper parent is.
Manifests as an NPE
{noformat}
Caused by: java.lang.NullPointerException
at org.apache.hadoop.hive.ql.exec.MapJoinOperator.process(MapJoinOperator.java:350)
{noformat}
> sortmergejoin can produce incorrect plan wrt dummy operators
> ------------------------------------------------------------
>
> Key: HIVE-15235
> URL: https://issues.apache.org/jira/browse/HIVE-15235
> Project: Hive
> Issue Type: Bug
> Reporter: Sergey Shelukhin