[
https://issues.apache.org/jira/browse/HIVE-15278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15702566#comment-15702566
]
Gunther Hagleitner commented on HIVE-15278:
-------------------------------------------
LGTM +1. This does look like it'd be painful to debug. Is it possible to add a
small test to avoid this debug pain for the next person?
One thing I'm not completely sure of: the bug is that the join operator is
trying to pump records through its parents after they have been closed. It's
doing that to finish the last pending group when the first of its parents is
closed. Your fix finishes the group after the first parent is closed, not the
last. Do you know for a fact that the join operator won't try to push records
through that (closed) parent? (I think that's the case because it's the big
table side and all remaining records should be from other branches.)
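
To make the ordering concern concrete, here is a minimal sketch in plain Java.
It does not use Hive's classes: Branch, finishLastGroup, finishAfterLastClose
and finishAfterFirstClose are hypothetical stand-ins for an input branch that
frees its state on close and for the join's end-of-input handling. It only
assumes, as described above, that the remaining records come from the other
(small table) branches.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

public class CloseOrderSketch {

    /** Hypothetical stand-in for one input branch: pending rows plus operator state. */
    public static class Branch {
        private final Deque<String> pending = new ArrayDeque<>();
        // Freed on close, loosely modelled on an operator dropping its row container.
        private List<String> state = new ArrayList<>();

        public Branch(String... rows) {
            pending.addAll(Arrays.asList(rows));
        }

        /** Pushes one pending row through this branch; returns null when exhausted. */
        public String pushRecord() {
            String row = pending.poll();
            if (row == null) {
                return null;
            }
            state.add(row); // throws NullPointerException if the branch is already closed
            return row;
        }

        public void close() {
            state = null;
        }
    }

    /** Hypothetical stand-in for the join draining the other branches to finish its last group. */
    public static List<String> finishLastGroup(List<Branch> smallSides) {
        List<String> out = new ArrayList<>();
        for (Branch b : smallSides) {
            for (String row = b.pushRecord(); row != null; row = b.pushRecord()) {
                out.add(row);
            }
        }
        return out;
    }

    /** Problem ordering: every parent is closed first, then the join pulls rows through them. */
    public static List<String> finishAfterLastClose(Branch big, List<Branch> small) {
        big.close();
        for (Branch b : small) {
            b.close();
        }
        return finishLastGroup(small);
    }

    /** Ordering of the fix as I read it: finish once the first (big table) parent closes. */
    public static List<String> finishAfterFirstClose(Branch big, List<Branch> small) {
        big.close();
        List<String> out = finishLastGroup(small); // the other branches are still open here
        for (Branch b : small) {
            b.close();
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(finishAfterFirstClose(new Branch(), Arrays.asList(new Branch("a", "b"))));
        try {
            finishAfterLastClose(new Branch(), Arrays.asList(new Branch("a", "b")));
        } catch (NullPointerException e) {
            System.out.println("NPE when draining through closed branches");
        }
    }
}
```

In this sketch the early finish is safe only because finishLastGroup never
touches the big-table branch. That is exactly the property I am asking about
for the real operator.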
> PTF+MergeJoin = NPE
> -------------------
>
> Key: HIVE-15278
> URL: https://issues.apache.org/jira/browse/HIVE-15278
> Project: Hive
> Issue Type: Bug
> Reporter: Sergey Shelukhin
> Assignee: Sergey Shelukhin
> Attachments: HIVE-15278.patch
>
>
> Manifests as
> {noformat}
> Caused by: java.lang.NullPointerException
> at org.apache.hadoop.hive.ql.exec.persistence.PTFRowContainer.first(PTFRowContainer.java:115)
> at org.apache.hadoop.hive.ql.exec.PTFPartition.iterator(PTFPartition.java:114)
> at org.apache.hadoop.hive.ql.exec.PTFOperator$PTFInvocation.finishPartition(PTFOperator.java:340)
> at org.apache.hadoop.hive.ql.exec.PTFOperator.process(PTFOperator.java:114)
> at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:838)
> at org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:88)
> at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource$GroupIterator.next(ReduceRecordSource.java:343)
> ... 29 more
> {noformat}
> It's actually a somewhat subtle ordering problem in the sort-merge join: as it
> stands, closeOp calls into different branches of the tree after they themselves
> have already been closed. Other operators that clean up state in close may fail
> with different errors. The common pattern is
> {noformat}
> at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource$GroupIterator.next(ReduceRecordSource.java:352)
> at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecord(ReduceRecordSource.java:274)
> at org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.fetchOneRow(CommonMergeJoinOperator.java:404)
> ...
> at org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.joinFinalLeftData(CommonMergeJoinOperator.java:428)
> at org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.closeOp(CommonMergeJoinOperator.java:388)
> at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:617)
> ...
> at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.close(ReduceRecordProcessor.java:294)
> {noformat}