[ 
https://issues.apache.org/jira/browse/HIVE-11262?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Dere updated HIVE-11262:
------------------------------
    Attachment: HIVE-11262.2.patch

Updating the patch to correct how the size of HybridHashTableContainer is
computed, based on feedback from [~wzheng]: the size of an on-disk hash
partition depends on both the on-disk hash table and the side table. Added a
new field to keep track of the on-disk hash table size.

Also made a couple of small fixes to HybridHashTableContainer:
- In MapJoin.reloadHashTable(), totalInMemRowCount was double-counting the size
of the side table, since restoreHashMap had already added the side table
values.
- hashMapOnDisk was not being reset to false when the on-disk hash table was
cleaned up.
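
For readers following the size accounting, here is a minimal sketch of the
bookkeeping described above. It is not the code in the patch: the class
HashPartitionSketch and all of its fields and methods are hypothetical names,
and row counts stand in for whatever size measure the container really uses.

```java
/** Hypothetical model of one hash partition; names are illustrative, not Hive's. */
class HashPartitionSketch {
    private int inMemoryRows;        // rows held in the in-memory hash map
    private int onDiskHashTableRows; // separately tracked size of the spilled hash table
    private int sideTableRows;       // rows added to the side table after the spill
    private boolean hashMapOnDisk;   // true while the hash table lives on disk

    /** Adds one row: to the side table if spilled, otherwise to memory. */
    void put() {
        if (hashMapOnDisk) {
            sideTableRows++;
        } else {
            inMemoryRows++;
        }
    }

    /** Moves the in-memory hash table to disk and remembers its size. */
    void spill() {
        onDiskHashTableRows = inMemoryRows;
        inMemoryRows = 0;
        hashMapOnDisk = true;
    }

    /** Partition size: for a spilled partition, on-disk hash table plus side table. */
    int size() {
        return hashMapOnDisk ? onDiskHashTableRows + sideTableRows : inMemoryRows;
    }

    /** Restores the partition into memory and returns the restored row count. */
    int reload() {
        // The restored count already includes the side table rows,
        // so the caller must not add sideTableRows a second time.
        int restored = onDiskHashTableRows + sideTableRows;
        inMemoryRows = restored;
        clearOnDisk();
        return restored;
    }

    /** Cleans up the on-disk state, including the flag. */
    void clearOnDisk() {
        onDiskHashTableRows = 0;
        sideTableRows = 0;
        hashMapOnDisk = false; // reset so size() reads the in-memory count again
    }

    boolean isHashMapOnDisk() {
        return hashMapOnDisk;
    }
}
```

In this sketch, a partition that spills three rows and then receives two more
reports a size of five both before and after reload(), and the flag is false
once the on-disk state has been cleaned up.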

> Skip MapJoin processing if the join hash table is empty
> -------------------------------------------------------
>
>                 Key: HIVE-11262
>                 URL: https://issues.apache.org/jira/browse/HIVE-11262
>             Project: Hive
>          Issue Type: Improvement
>          Components: Query Processor
>            Reporter: Jason Dere
>            Assignee: Jason Dere
>         Attachments: HIVE-11262.1.patch, HIVE-11262.2.patch
>
>
> Currently the map join processor processes all rows of the big table, even 
> when the hash table is empty. If it is an inner join, we should be able to 
> skip the join processing, since the result should be empty.
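
The idea in the issue description can be sketched roughly as follows. This is
an illustration only, not the MapJoin operator code: EmptyHashTableSkipSketch,
JoinType and mapJoin are hypothetical names, and rows are modelled as simple
key/value string pairs.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of the short-circuit; not Hive's operator code. */
class EmptyHashTableSkipSketch {
    enum JoinType { INNER, LEFT_OUTER }

    /** Joins big-table rows {key, value} against a small-table hash table. */
    static List<String> mapJoin(List<String[]> bigTableRows,
                                Map<String, List<String>> hashTable,
                                JoinType joinType) {
        // An inner join against an empty hash table cannot produce any row,
        // so return before scanning the big table at all.
        if (joinType == JoinType.INNER && hashTable.isEmpty()) {
            return Collections.emptyList();
        }
        List<String> out = new ArrayList<>();
        for (String[] row : bigTableRows) {
            List<String> matches = hashTable.get(row[0]);
            if (matches != null) {
                for (String match : matches) {
                    out.add(row[0] + "," + row[1] + "," + match);
                }
            } else if (joinType == JoinType.LEFT_OUTER) {
                // Outer joins still emit unmatched big-table rows,
                // which is why only the inner join case is skipped here.
                out.add(row[0] + "," + row[1] + ",null");
            }
        }
        return out;
    }
}
```

The check is limited to inner joins in this sketch because a left outer join
must still emit every big-table row even when nothing matches.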



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
