[ 
https://issues.apache.org/jira/browse/HIVE-4850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Remus Rusanu updated HIVE-4850:
-------------------------------

    Attachment: HIVE-4850.2.patch

This is a working implementation based on current trunk. It is simpler than the 
.1 patch in that it delegates the JOIN entirely to the row-mode MapJoinOperator. 
The vectorized operator literally calls the row-mode implementation for each 
row in the input batch and collects the rows it forwards into the output 
batch. This is not as bad as it seems, because the JOIN operator has to resort 
to row-mode operations anyway: the small tables (hashtables) are row-mode 
(objects and object-inspectors). By delegating the entire join logic to row 
mode, we piggyback on the correctness of the existing implementation. I do 
plan to come up with a fully vectorized implementation, but that would 
require changes to the hash table creation and serialization. Note that the 
filtering and key evaluation of the big table *do* use vectorized operators; 
row mode applies only to the key HT lookup and to the JOIN logic.
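
The delegation described above can be sketched roughly as follows. This is an 
illustration only, not code from the patch: Batch, RowModeJoin and 
VectorizedJoin are hypothetical stand-ins rather than Hive classes, the small 
table is assumed to be a plain in-memory map, the join is assumed to be an 
inner join, and the output batch is simplified to a list of rows instead of a 
columnar structure.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

// Hypothetical sketch; none of these types are Hive classes.
final class VectorizedJoinSketch {

    /** Simplified columnar input batch: one key column and one value column. */
    static final class Batch {
        final long[] keys;
        final String[] values;
        int size;

        Batch(int capacity) {
            keys = new long[capacity];
            values = new String[capacity];
        }

        void add(long key, String value) {
            keys[size] = key;
            values[size] = value;
            size++;
        }
    }

    /** Stand-in for the row-mode join: probes the small table, forwards one row per match. */
    static final class RowModeJoin {
        private final Map<Long, List<String>> smallTable;
        private final Consumer<Object[]> forward;

        RowModeJoin(Map<Long, List<String>> smallTable, Consumer<Object[]> forward) {
            this.smallTable = smallTable;
            this.forward = forward;
        }

        void process(Object[] bigTableRow) {
            List<String> matches = smallTable.get((Long) bigTableRow[0]);
            if (matches == null) {
                return; // assumed inner join: unmatched rows produce nothing
            }
            for (String match : matches) {
                forward.accept(new Object[] {bigTableRow[0], bigTableRow[1], match});
            }
        }
    }

    /** Vectorized wrapper: unpacks each batch row, delegates it, collects what is forwarded. */
    static final class VectorizedJoin {
        private final List<Object[]> outputBatch = new ArrayList<>();
        private final int outputCapacity;
        private final Consumer<List<Object[]>> downstream;
        private final RowModeJoin delegate;

        VectorizedJoin(Map<Long, List<String>> smallTable, int outputCapacity,
                       Consumer<List<Object[]>> downstream) {
            this.outputCapacity = outputCapacity;
            this.downstream = downstream;
            // The row-mode join forwards into this operator instead of a child operator.
            this.delegate = new RowModeJoin(smallTable, this::collect);
        }

        void process(Batch input) {
            for (int i = 0; i < input.size; i++) {
                delegate.process(new Object[] {input.keys[i], input.values[i]});
            }
            flush(); // emit whatever is left once the input batch is consumed
        }

        private void collect(Object[] joinedRow) {
            outputBatch.add(joinedRow);
            if (outputBatch.size() == outputCapacity) {
                flush(); // a single input row can yield many output rows
            }
        }

        private void flush() {
            if (!outputBatch.isEmpty()) {
                downstream.accept(new ArrayList<>(outputBatch));
                outputBatch.clear();
            }
        }
    }
}
```

The point of the sketch is that the join logic lives in one place (the 
row-mode class), while the wrapper only converts between batches and rows. The 
output has to be flushed independently of the input batch boundaries because 
one big-table row may match several small-table rows.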

> Implement vectorized JOIN operators
> -----------------------------------
>
>                 Key: HIVE-4850
>                 URL: https://issues.apache.org/jira/browse/HIVE-4850
>             Project: Hive
>          Issue Type: Sub-task
>            Reporter: Remus Rusanu
>            Assignee: Remus Rusanu
>         Attachments: HIVE-4850.1.patch, HIVE-4850.2.patch
>
>
> Easysauce



--
This message was sent by Atlassian JIRA
(v6.1#6144)
