[ 
https://issues.apache.org/jira/browse/HIVE-13809?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei Zheng updated HIVE-13809:
-----------------------------
       Resolution: Fixed
    Fix Version/s: 2.2.0
                   2.1.0
           Status: Resolved  (was: Patch Available)

Thanks [~gopalv] for the review. Committed to master and branch-2.1.

> Hybrid Grace Hash Join memory usage estimation didn't take into account the 
> bloom filter size
> ---------------------------------------------------------------------------------------------
>
>                 Key: HIVE-13809
>                 URL: https://issues.apache.org/jira/browse/HIVE-13809
>             Project: Hive
>          Issue Type: Bug
>          Components: Hive
>    Affects Versions: 2.0.0, 2.1.0
>            Reporter: Wei Zheng
>            Assignee: Wei Zheng
>             Fix For: 2.1.0, 2.2.0
>
>         Attachments: HIVE-13809.1.patch
>
>
> Memory estimation is important during hash table loading, because we need to 
> decide whether to load the next hash partition into memory or spill it. If 
> we assume there is enough memory and that turns out not to be the case, we 
> will run into an OOM.
> Currently the hybrid grace hash join memory usage estimation does not take 
> the bloom filter size into account. In large test cases (TB scale) the bloom 
> filter grows to hundreds of MB, big enough to cause estimation errors.
> The solution is to include the bloom filter size in the memory estimation.
> Another issue this patch fixes is a possible NPE due to object cache reuse 
> during hybrid grace hash join.
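
For readers following the description above, here is a minimal sketch of a load-or-spill decision that counts the bloom filter in the estimate. It is not the code from HIVE-13809.1.patch: the class name, fields, and method names are hypothetical, and the single fixed memory budget is a simplifying assumption.

```java
// Illustrative only: SpillDecisionSketch and its members are hypothetical
// names, not Hive classes. The model assumes one fixed memory budget shared
// by the in-memory hash partitions and the bloom filter.
final class SpillDecisionSketch {
    private final long memoryBudgetBytes;
    private final long bloomFilterBytes;
    private long hashPartitionBytesInMemory = 0L;

    SpillDecisionSketch(long memoryBudgetBytes, long bloomFilterBytes) {
        if (memoryBudgetBytes < 0 || bloomFilterBytes < 0) {
            throw new IllegalArgumentException("sizes must be non-negative");
        }
        this.memoryBudgetBytes = memoryBudgetBytes;
        this.bloomFilterBytes = bloomFilterBytes;
    }

    /** Estimated usage: in-memory hash partitions plus the bloom filter. */
    long estimatedUsageBytes() {
        return hashPartitionBytesInMemory + bloomFilterBytes;
    }

    /**
     * Decides whether the next hash partition fits in memory.
     * Returns true and records the partition if it fits, false if the
     * caller should spill it instead.
     */
    boolean tryLoadPartition(long partitionBytes) {
        if (estimatedUsageBytes() + partitionBytes <= memoryBudgetBytes) {
            hashPartitionBytesInMemory += partitionBytes;
            return true;
        }
        return false;
    }
}
```

With a 1000 MB budget and 400 MB partitions, an estimate that ignores the bloom filter would accept two partitions. Once a 300 MB bloom filter is counted, only the first partition fits and the second has to be spilled.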



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
