[jira] [Updated] (HIVE-12369) Native Vector GroupBy (Part 1)

Matt McCline (JIRA) Sat, 21 Apr 2018 15:38:39 -0700

     [ https://issues.apache.org/jira/browse/HIVE-12369?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]


Matt McCline updated HIVE-12369:
--------------------------------
    Description: 
Implement Native Vector GroupBy using fast hash table technology developed for 
Native Vector MapJoin, etc.

The patch is currently limited to a single COUNT aggregation, or to no 
aggregation at all (also known as duplicate reduction).

Here are examples of the new kinds of classes introduced, which store the count 
in the slot table and don't allocate hash elements:
{noformat}
  COUNT(column)  VectorGroupByHashLongKeySingleCountColumnOperator      
  COUNT(key)     VectorGroupByHashLongKeySingleCountKeyOperator            
  COUNT(*)       VectorGroupByHashLongKeySingleCountStarOperator           
{noformat}
And the duplicate reduction operator for a single key.  Example:
{noformat}
  VectorGroupByHashLongKeyDuplicateReductionOperator
{noformat}
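As a rough illustration of the "count in the slot table" idea (not the code in 
the patch), here is a minimal sketch of an open-addressing table for long keys. 
The class name LongKeyCountSketch, its methods, the parallel-array layout, the 
hash function, and the 50% load factor are all assumptions made for this 
example. Null keys are not handled here.

```java
/**
 * Hypothetical sketch only; these are not Hive's actual classes.
 * Keys and counts live directly in primitive arrays (the "slots"),
 * so no per-key hash element object is ever allocated.
 */
final class LongKeyCountSketch {
    private long[] keys;
    private long[] counts;
    private boolean[] used;
    private int mask;
    private int size;

    /** Assumes capacity is a power of two and at least 1. */
    LongKeyCountSketch(int capacity) {
        keys = new long[capacity];
        counts = new long[capacity];
        used = new boolean[capacity];
        mask = capacity - 1;
    }

    private static int hash(long key) {
        long h = key * 0x9E3779B97F4A7C15L; // simple multiplicative mix
        return (int) (h ^ (h >>> 32));
    }

    /** Linear probing: returns the slot holding key, or the first free slot. */
    private int findSlot(long key) {
        int slot = hash(key) & mask;
        while (used[slot] && keys[slot] != key) {
            slot = (slot + 1) & mask;
        }
        return slot;
    }

    /**
     * Adds delta to the count for key, creating the group if needed.
     * COUNT(*) would pass 1 for every row; COUNT(column) would pass 0
     * when the column value is null, so the group still exists.
     */
    void add(long key, long delta) {
        if ((size + 1) * 2 > keys.length) {
            resize(); // keep the table at most half full
        }
        int slot = findSlot(key);
        if (!used[slot]) {
            used[slot] = true;
            keys[slot] = key;
            size++;
        }
        counts[slot] += delta;
    }

    /** Returns the count for key, or -1 if the key was never added. */
    long count(long key) {
        int slot = findSlot(key);
        return used[slot] ? counts[slot] : -1;
    }

    /** Number of distinct keys seen so far. */
    int size() {
        return size;
    }

    private void resize() {
        long[] oldKeys = keys;
        long[] oldCounts = counts;
        boolean[] oldUsed = used;
        int newCapacity = oldKeys.length * 2;
        keys = new long[newCapacity];
        counts = new long[newCapacity];
        used = new boolean[newCapacity];
        mask = newCapacity - 1;
        for (int i = 0; i < oldKeys.length; i++) {
            if (oldUsed[i]) {
                int slot = findSlot(oldKeys[i]);
                used[slot] = true;
                keys[slot] = oldKeys[i];
                counts[slot] = oldCounts[i];
            }
        }
    }

    public static void main(String[] args) {
        LongKeyCountSketch table = new LongKeyCountSketch(4);
        table.add(7L, 1); // row with a non-null column value
        table.add(7L, 1);
        table.add(9L, 0); // row whose counted column is null
        System.out.println("count(7) = " + table.count(7L));
        System.out.println("count(9) = " + table.count(9L));
        System.out.println("distinct keys = " + table.size());
    }
}
```

Under the same assumptions, duplicate reduction would be this table without 
the counts array: only the distinct keys need to be remembered and emitted.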

  was:
Implement Native Vector GroupBy using fast hash table technology developed for 
Native Vector MapJoin, etc.

The patch is currently limited to a single key with a single COUNT aggregation, 
or to a single key and no aggregation (also known as duplicate reduction).

Three new kinds of classes are introduced that store the count in the slot 
table and don't allocate hash elements.  Example:
{noformat}
  COUNT(column)  VectorGroupByHashLongKeySingleCountColumnOperator      
  COUNT(key)     VectorGroupByHashLongKeySingleCountKeyOperator            
  COUNT(*)       VectorGroupByHashLongKeySingleCountStarOperator           
{noformat}
And the duplicate reduction operator for a single key.  Example:
{noformat}
  VectorGroupByHashLongKeyDuplicateReductionOperator
{noformat}


> Native Vector GroupBy (Part 1)
> ------------------------------
>
>                 Key: HIVE-12369
>                 URL: https://issues.apache.org/jira/browse/HIVE-12369
>             Project: Hive
>          Issue Type: Bug
>          Components: Hive
>            Reporter: Matt McCline
>            Assignee: Matt McCline
>            Priority: Critical
>         Attachments: HIVE-12369.01.patch, HIVE-12369.02.patch, 
> HIVE-12369.05.patch, HIVE-12369.06.patch, HIVE-12369.091.patch, 
> HIVE-12369.094.patch, HIVE-12369.095.patch, HIVE-12369.096.patch, 
> HIVE-12369.097.patch, HIVE-12369.098.patch
>
>
> Implement Native Vector GroupBy using fast hash table technology developed 
> for Native Vector MapJoin, etc.
> The patch is currently limited to a single COUNT aggregation, or to no 
> aggregation at all (also known as duplicate reduction).
> Here are examples of the new kinds of classes introduced, which store the 
> count in the slot table and don't allocate hash elements:
> {noformat}
>   COUNT(column)  VectorGroupByHashLongKeySingleCountColumnOperator      
>   COUNT(key)     VectorGroupByHashLongKeySingleCountKeyOperator            
>   COUNT(*)       VectorGroupByHashLongKeySingleCountStarOperator           
> {noformat}
> And the duplicate reduction operator for a single key.  Example:
> {noformat}
>   VectorGroupByHashLongKeyDuplicateReductionOperator
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
