[jira] [Created] (HIVE-8700) Replace ReduceSink to HashTableSink (or equi.) for small tables [Spark Branch]

Xuefu Zhang (JIRA) Sun, 02 Nov 2014 07:23:58 -0800

Xuefu Zhang created HIVE-8700:
---------------------------------

             Summary: Replace ReduceSink to HashTableSink (or equi.) for small tables [Spark Branch]
                 Key: HIVE-8700
                 URL: https://issues.apache.org/jira/browse/HIVE-8700
             Project: Hive
          Issue Type: Sub-task
          Components: Spark
            Reporter: Xuefu Zhang
            Assignee: Szehon Ho



With HIVE-8616 enabled, the new plan contains a ReduceSinkOperator for the 
small tables. For example, the following is the operator plan for the small 
table dec1, derived from the query {code}explain select /*+ MAPJOIN(dec)*/ * from 
dec join dec1 on dec.value=dec1.d;{code}
{code}
        Map 2 
            Map Operator Tree:
                TableScan
                  alias: dec1
                  Statistics: Num rows: 0 Data size: 107 Basic stats: PARTIAL Column stats: NONE
                  Filter Operator
                    predicate: d is not null (type: boolean)
                    Statistics: Num rows: 0 Data size: 0 Basic stats: NONE Column stats: NONE
                    Reduce Output Operator
                      key expressions: d (type: decimal(5,2))
                      sort order: +
                      Map-reduce partition columns: d (type: decimal(5,2))
                      Statistics: Num rows: 0 Data size: 0 Basic stats: NONE Column stats: NONE
                      value expressions: i (type: int)

{code}
With the new design for broadcasting small tables, we need to replace the 
ReduceSinkOperator with a HashTableSinkOperator or an equivalent in the new plan.
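
To illustrate the intended rewrite, here is a minimal sketch in plain Java. It does not use Hive's real operator classes: PlanNode, SmallTablePlanRewriter, and the node names are hypothetical stand-ins chosen for this example, and the actual change would have to work against Hive's operator tree and the Spark branch's plan generation. It only shows the shape of the transformation on the small-table branch above (TableScan, Filter, then the sink).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified plan node. This is NOT Hive's Operator class.
class PlanNode {
    final String name;
    final List<PlanNode> children = new ArrayList<>();

    PlanNode(String name) {
        this.name = name;
    }

    PlanNode add(PlanNode child) {
        children.add(child);
        return this;
    }
}

// Hypothetical rewriter for a small-table branch of the plan.
class SmallTablePlanRewriter {

    // Returns a copy of the subtree in which every "ReduceSink" node
    // is replaced by a "HashTableSink" node; all other nodes are kept.
    static PlanNode rewrite(PlanNode node) {
        String newName = node.name.equals("ReduceSink") ? "HashTableSink" : node.name;
        PlanNode copy = new PlanNode(newName);
        for (PlanNode child : node.children) {
            copy.add(rewrite(child));
        }
        return copy;
    }

    // Renders a linear operator chain as "A -> B -> C".
    static String render(PlanNode node) {
        StringBuilder sb = new StringBuilder(node.name);
        for (PlanNode child : node.children) {
            sb.append(" -> ").append(render(child));
        }
        return sb.toString();
    }

    // Builds the small-table branch from the plan above:
    // TableScan -> Filter -> ReduceSink.
    static PlanNode smallTableBranch() {
        PlanNode scan = new PlanNode("TableScan");
        PlanNode filter = new PlanNode("Filter");
        filter.add(new PlanNode("ReduceSink"));
        scan.add(filter);
        return scan;
    }
}
```

Calling SmallTablePlanRewriter.rewrite(SmallTablePlanRewriter.smallTableBranch()) yields a chain that ends in the hash table sink instead of the reduce sink, leaving the original tree unchanged. Copying rather than mutating is only a choice made for this sketch.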



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
