[ https://issues.apache.org/jira/browse/HIVE-11262?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jason Dere updated HIVE-11262:
------------------------------
Attachment: HIVE-11262.2.patch
Updated the patch to fix how the size of HybridHashTableContainer is computed,
based on feedback from [~wzheng]: the size of an on-disk hash partition depends
on both the on-disk hash table and the side table. Added a new field to keep
track of the on-disk hash table size.
Also made a couple of small fixes to HybridHashTableContainer:
- In MapJoin.reloadHashTable(), totalInMemRowCount was double-counting the size
of the side table, since restoreHashMap had already added the side table
values.
- hashMapOnDisk was not being reset to false when the on-disk hash table was
being cleaned up.
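
The bookkeeping described above can be sketched roughly as follows. This is
only an illustration, not the code in the patch: the class, its methods, and
the choice to measure size in rows are all assumptions made for the example.
Only the name hashMapOnDisk comes from the comment above.

```java
// Hypothetical sketch of per-partition bookkeeping; not Hive's actual class.
final class HashPartitionSketch {
    private long hashTableOnDiskSize; // assumed: rows in the spilled hash table
    private long sideTableSize;       // assumed: rows buffered in the side table
    private boolean hashMapOnDisk;

    void spillHashTable(long rows) {
        hashMapOnDisk = true;
        hashTableOnDiskSize = rows;
    }

    void addToSideTable(long rows) {
        sideTableSize += rows;
    }

    // On-disk partition size covers both the spilled hash table and the side table.
    long onDiskPartitionSize() {
        return hashMapOnDisk ? hashTableOnDiskSize + sideTableSize : 0;
    }

    // Returns the number of rows now in memory. Side table rows are already
    // included, so the caller must not add sideTableSize a second time.
    long restore() {
        long restored = hashTableOnDiskSize + sideTableSize;
        cleanUpOnDisk();
        return restored;
    }

    // Clears on-disk state and resets the flag so later checks see in-memory state.
    void cleanUpOnDisk() {
        hashTableOnDiskSize = 0;
        sideTableSize = 0;
        hashMapOnDisk = false;
    }

    boolean isHashMapOnDisk() {
        return hashMapOnDisk;
    }
}
```

In this sketch a caller reloading the partition would do
"totalInMemRowCount += partition.restore()" and nothing more. Adding the side
table size again afterwards would reproduce the double counting.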
> Skip MapJoin processing if the join hash table is empty
> -------------------------------------------------------
>
> Key: HIVE-11262
> URL: https://issues.apache.org/jira/browse/HIVE-11262
> Project: Hive
> Issue Type: Improvement
> Components: Query Processor
> Reporter: Jason Dere
> Assignee: Jason Dere
> Attachments: HIVE-11262.1.patch, HIVE-11262.2.patch
>
>
> Currently the map join processor processes all rows of the big table, even
> when the hash table is empty. If it is an inner join, we should be able to
> skip the join processing, since the result should be empty.
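
A minimal sketch of the idea in the description, assuming a simplified
in-memory inner join. The class and method names are hypothetical and do not
correspond to Hive's MapJoin operator code. It shows why the big table does
not need to be read when the hash table is empty.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

// Hypothetical illustration; not Hive's operator implementation.
final class InnerJoinSketch {
    static List<String> innerJoin(Map<Integer, String> hashTable,
                                  Iterator<Integer> bigTableKeys) {
        List<String> out = new ArrayList<>();
        // Inner join against an empty hash table can produce no rows,
        // so return without consuming any big-table rows.
        if (hashTable.isEmpty()) {
            return out;
        }
        while (bigTableKeys.hasNext()) {
            Integer key = bigTableKeys.next();
            String match = hashTable.get(key);
            if (match != null) {
                out.add(key + ":" + match);
            }
        }
        return out;
    }
}
```

The early return is valid only for an inner join. An outer join that
preserves the big table side must still emit its rows, so the shortcut would
not apply there.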