[jira] [Commented] (HIVE-9277) Hybrid Hybrid Grace Hash Join

Wei Zheng (JIRA) Mon, 02 Mar 2015 17:18:24 -0800

    [ 
https://issues.apache.org/jira/browse/HIVE-9277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14344229#comment-14344229
 ]


Wei Zheng commented on HIVE-9277:
---------------------------------

Right now I'm using HIVECONVERTJOINNOCONDITIONALTASK as the threshold for the 
estimation. Once the memory management part is ready, I can rely on it to 
provide an exact number.

> Hybrid Hybrid Grace Hash Join
> -----------------------------
>
>                 Key: HIVE-9277
>                 URL: https://issues.apache.org/jira/browse/HIVE-9277
>             Project: Hive
>          Issue Type: New Feature
>          Components: Physical Optimizer
>            Reporter: Wei Zheng
>            Assignee: Wei Zheng
>              Labels: join
>         Attachments: HIVE-9277.01.patch, HIVE-9277.02.patch, 
> HIVE-9277.03.patch, HIVE-9277.04.patch, HIVE-9277.05.patch, 
> HIVE-9277.06.patch, High-leveldesignforHybridHybridGraceHashJoinv1.0.pdf
>
>
> We are proposing an enhanced hash join algorithm called _“hybrid hybrid grace 
> hash join”_.
> This feature provides the following benefits:
> * The query will not fail even if the estimated memory requirement is 
> slightly wrong
> * Expensive garbage collection overhead can be avoided when the hash table 
> grows
> * The join can still be executed with a Map join operator even when the small 
> table doesn't fit in memory, because spilling some data from the build and 
> probe sides is still cheaper than shuffling the large fact table
> The design is based on Hadoop’s parallel processing capability and the 
> significant amount of memory available.
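
The spilling behaviour described in the bullets above can be illustrated with a small sketch. This is not the code from the attached patches; it is a minimal, simplified Python model of a hash join that spills partitions when a memory budget is exceeded. The function name, the parameters `num_partitions` and `memory_limit_rows`, and the use of a row count in place of a byte threshold are all assumptions made for illustration, and plain Python lists stand in for spill files on disk.

```python
from collections import defaultdict


def hybrid_grace_hash_join(build_rows, probe_rows,
                           num_partitions=4, memory_limit_rows=1000):
    """Inner-join two lists of (key, value) pairs.

    Hypothetical sketch: memory_limit_rows stands in for a memory threshold,
    and Python lists stand in for partitions spilled to disk.
    """
    # Build phase: hash the small table into partitions held in memory.
    in_memory = {p: defaultdict(list) for p in range(num_partitions)}
    spilled_build = {}  # partition id -> rows "on disk"
    rows_in_memory = 0

    for key, value in build_rows:
        p = hash(key) % num_partitions
        if p in spilled_build:
            # Partition was already spilled: append the row to its spill area.
            spilled_build[p].append((key, value))
            continue
        in_memory[p][key].append(value)
        rows_in_memory += 1
        # Over budget: spill the largest in-memory partition instead of failing.
        while rows_in_memory > memory_limit_rows and in_memory:
            victim = max(
                in_memory,
                key=lambda q: sum(len(vs) for vs in in_memory[q].values()),
            )
            table = in_memory.pop(victim)
            spilled_build[victim] = [(k, v) for k, vs in table.items() for v in vs]
            rows_in_memory -= len(spilled_build[victim])

    # Probe phase: join immediately against in-memory partitions and set aside
    # probe rows whose matching build partition was spilled.
    results = []
    spilled_probe = defaultdict(list)
    for key, value in probe_rows:
        p = hash(key) % num_partitions
        if p in spilled_build:
            spilled_probe[p].append((key, value))
        else:
            for build_value in in_memory[p].get(key, ()):
                results.append((key, build_value, value))

    # Deferred phase: process each spilled partition pair on its own.
    # Assumption: a single spilled partition fits in memory when reloaded.
    for p, build_part in spilled_build.items():
        table = defaultdict(list)
        for key, value in build_part:
            table[key].append(value)
        for key, value in spilled_probe.get(p, ()):
            for build_value in table.get(key, ()):
                results.append((key, build_value, value))

    return results, sorted(spilled_build)
```

Calling the function with a generous `memory_limit_rows` keeps every partition in memory, while a tight limit forces some partitions to spill; the joined rows are the same in both cases, only the order may differ. A fuller implementation would also need to handle a spilled partition that is still too large when reloaded, which this sketch leaves out.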



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
