[ https://issues.apache.org/jira/browse/HIVE-8934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14225640#comment-14225640 ]
Chao commented on HIVE-8934:
----------------------------
This issue occurs because {{MapJoinTableContainerSerDe#load}} sometimes
overwrites the values of existing keys, since different input files may
contain the same key. I also had to change {{MapJoinEagerRowContainer#read}}
so that it does not reset its rows every time it is called. I think this is OK,
since this method is only called in three places, and each of them initializes
a new instance of this class before calling the method, so the reset is not
necessary.
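
To illustrate the overwrite problem described above, here is a minimal, self-contained sketch. It is not the actual Hive code: the class {{MergeOnLoadSketch}}, the methods {{loadOverwriting}} and {{loadMerging}}, and the use of plain maps of strings to stand in for the table container and row containers are all hypothetical, chosen only to show why rows are lost when two input files contain the same key.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: each "file" is modelled as a map from join key to rows.
// None of these names come from Hive; they only illustrate the idea.
class MergeOnLoadSketch {

    // Buggy behaviour: a key seen in a later file replaces the rows
    // that an earlier file already contributed for that key.
    static Map<String, List<String>> loadOverwriting(List<Map<String, List<String>>> files) {
        Map<String, List<String>> table = new HashMap<>();
        for (Map<String, List<String>> file : files) {
            for (Map.Entry<String, List<String>> e : file.entrySet()) {
                table.put(e.getKey(), new ArrayList<>(e.getValue()));
            }
        }
        return table;
    }

    // Merging behaviour: rows for an existing key are appended,
    // so no file's contribution is dropped.
    static Map<String, List<String>> loadMerging(List<Map<String, List<String>>> files) {
        Map<String, List<String>> table = new HashMap<>();
        for (Map<String, List<String>> file : files) {
            for (Map.Entry<String, List<String>> e : file.entrySet()) {
                table.computeIfAbsent(e.getKey(), k -> new ArrayList<>())
                     .addAll(e.getValue());
            }
        }
        return table;
    }

    public static void main(String[] args) {
        // Two input files that both contain the key "k1".
        Map<String, List<String>> file1 = new HashMap<>();
        file1.put("k1", Arrays.asList("a"));
        Map<String, List<String>> file2 = new HashMap<>();
        file2.put("k1", Arrays.asList("b"));
        file2.put("k2", Arrays.asList("c"));
        List<Map<String, List<String>>> files = Arrays.asList(file1, file2);

        System.out.println("overwriting k1: " + loadOverwriting(files).get("k1"));
        System.out.println("merging k1:     " + loadMerging(files).get("k1"));
    }
}
```

In this sketch the overwriting loader keeps only the rows from the last file that contained the key, while the merging loader keeps the rows from both files, which is the behaviour a map join needs in order to produce complete results.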
> Investigate test failure on bucketmapjoin10.q and bucketmapjoin11.q [Spark
> Branch]
> ----------------------------------------------------------------------------------
>
> Key: HIVE-8934
> URL: https://issues.apache.org/jira/browse/HIVE-8934
> Project: Hive
> Issue Type: Sub-task
> Components: Spark
> Affects Versions: spark-branch
> Reporter: Chao
> Assignee: Chao
> Attachments: HIVE-8934.1-spark.patch
>
>
> With MapJoin enabled, these two tests generate incorrect results.
> This seems to be related to the HiveInputFormat that these two tests use.
> We need to investigate the issue.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)