[ 
https://issues.apache.org/jira/browse/HIVE-1754?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Liyin Tang updated HIVE-1754:
-----------------------------

    Status: Patch Available  (was: Open)

This patch makes the following changes:
1) Removes JDBM from Hive.
2) Stores all the data of the small table in an in-memory hashtable.
3) Creates a lightweight RowContainer: MapJoinRowContainer.
4) Optimizes MapJoinObjectKey. If there are only one or two join keys, 
it uses MapJoinSingleKey or MapJoinDoubleKeys instead of MapJoinObjectKey.
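
As a rough illustration of items 2 and 4, the sketch below shows one way specialized key wrappers for one or two join keys could back a plain in-memory hashtable. The class names SingleKeySketch, DoubleKeysSketch, and SmallTableSketch are hypothetical and are not the classes in the attached patch; the sketch only assumes that a fixed-arity key can implement equals and hashCode without looping over a generic key array.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

// Hypothetical illustration only; not the classes from the patch.
// Wraps exactly one join key, so equality is a single comparison.
final class SingleKeySketch {
    private final Object key;

    SingleKeySketch(Object key) {
        this.key = key;
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof SingleKeySketch
            && Objects.equals(key, ((SingleKeySketch) o).key);
    }

    @Override
    public int hashCode() {
        return Objects.hashCode(key);
    }
}

// Wraps exactly two join keys; no array allocation or iteration is needed.
final class DoubleKeysSketch {
    private final Object first;
    private final Object second;

    DoubleKeysSketch(Object first, Object second) {
        this.first = first;
        this.second = second;
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof DoubleKeysSketch)) {
            return false;
        }
        DoubleKeysSketch other = (DoubleKeysSketch) o;
        return Objects.equals(first, other.first)
            && Objects.equals(second, other.second);
    }

    @Override
    public int hashCode() {
        return 31 * Objects.hashCode(first) + Objects.hashCode(second);
    }
}

// In-memory hashtable for the small table: join key -> rows with that key.
final class SmallTableSketch {
    private final Map<Object, List<Object[]>> rows = new HashMap<>();

    void put(Object key, Object[] row) {
        rows.computeIfAbsent(key, k -> new ArrayList<>()).add(row);
    }

    List<Object[]> get(Object key) {
        return rows.getOrDefault(key, Collections.emptyList());
    }
}
```

With these assumed classes, a caller would pick the wrapper by the number of join keys, for example `table.put(new SingleKeySketch(id), row)` for a single-column join, and fall back to a general key object only for three or more keys.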

> Remove JDBM component from Map Join
> -----------------------------------
>
>                 Key: HIVE-1754
>                 URL: https://issues.apache.org/jira/browse/HIVE-1754
>             Project: Hive
>          Issue Type: Improvement
>          Components: Query Processor
>    Affects Versions: 0.6.0, 0.7.0
>            Reporter: Liyin Tang
>            Assignee: Liyin Tang
>             Fix For: 0.7.0
>
>         Attachments: Hive-1754.patch
>
>
> Right now, JDBM is the major performance bottleneck of Map Join.
> As the small table grows, the PUT and GET operations take most 
> of the execution time.
> Map Join is designed to load the data of the small table into memory. 
> If the data is too large to hold in memory, then there is no need to use the 
> map join strategy.
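
For readers unfamiliar with the strategy described above, here is a minimal sketch of a map-side hash join: the small table is loaded into an in-memory map once, and each row of the large table probes it. The class MapJoinSketch, the use of String[] rows, and joining on column 0 are assumptions made for illustration; this is not Hive's implementation.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; not Hive code.
final class MapJoinSketch {
    // Inner join on column 0 of each row. Output rows are the big-table row
    // followed by the matching small-table row.
    static List<String[]> join(List<String[]> smallTable, List<String[]> bigTable) {
        // Build phase: the whole small table must fit in memory.
        Map<String, List<String[]>> index = new HashMap<>();
        for (String[] row : smallTable) {
            index.computeIfAbsent(row[0], k -> new ArrayList<>()).add(row);
        }

        // Probe phase: stream the big table and look up each key.
        List<String[]> out = new ArrayList<>();
        for (String[] bigRow : bigTable) {
            List<String[]> matches = index.get(bigRow[0]);
            if (matches == null) {
                continue; // no small-table row with this key
            }
            for (String[] smallRow : matches) {
                String[] joined = new String[bigRow.length + smallRow.length];
                System.arraycopy(bigRow, 0, joined, 0, bigRow.length);
                System.arraycopy(smallRow, 0, joined, bigRow.length, smallRow.length);
                out.add(joined);
            }
        }
        return out;
    }
}
```

The build phase is where the PUT cost is paid and the probe phase is where the GET cost is paid, which is why the speed of the backing store matters so much for this strategy.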

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
