[ 
https://issues.apache.org/jira/browse/HIVE-23723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17140588#comment-17140588
 ] 

Attila Magyar commented on HIVE-23723:
--------------------------------------

[~jcamachorodriguez], thanks for letting me know, I haven't realized it. Any 
idea why it is disabled by default? 

Also the plan looks different with limittranspose, not sure why. There are 3 
Limit operators. The first one is what was pushed through the LOJ. But there 
are 2 others in Reducer 2.

 
{code:java}
explain
SELECT src1.key, src2.value FROM src src1 LEFT OUTER JOIN src src2 ON (src1.key 
= src2.key) LIMIT 5; {code}
{code:java}
PREHOOK: query: explain
SELECT src1.key, src2.value FROM src src1 LEFT OUTER JOIN src src2 ON (src1.key 
= src2.key) LIMIT 5
PREHOOK: type: QUERY
PREHOOK: Input: default@src
#### A masked pattern was here ####
POSTHOOK: query: explain
SELECT src1.key, src2.value FROM src src1 LEFT OUTER JOIN src src2 ON (src1.key 
= src2.key) LIMIT 5
POSTHOOK: type: QUERY
POSTHOOK: Input: default@src
#### A masked pattern was here ####
STAGE DEPENDENCIES:
  Stage-1 is a root stage
  Stage-0 depends on stages: Stage-1STAGE PLANS:
  Stage: Stage-1
    Tez
#### A masked pattern was here ####
      Edges:
        Reducer 2 <- Map 1 (CUSTOM_SIMPLE_EDGE)
        Reducer 3 <- Map 4 (SIMPLE_EDGE), Reducer 2 (SIMPLE_EDGE)
#### A masked pattern was here ####
      Vertices:
        Map 1 
            Map Operator Tree:
                TableScan
                  alias: src1
                  Statistics: Num rows: 500 Data size: 43500 Basic stats: 
COMPLETE Column stats: COMPLETE
                  Select Operator
                    expressions: key (type: string)
                    outputColumnNames: _col0
                    Statistics: Num rows: 500 Data size: 43500 Basic stats: 
COMPLETE Column stats: COMPLETE
                    Limit
                      Number of rows: 5
                      Statistics: Num rows: 5 Data size: 435 Basic stats: 
COMPLETE Column stats: COMPLETE
                      Reduce Output Operator
                        null sort order: 
                        sort order: 
                        Statistics: Num rows: 5 Data size: 435 Basic stats: 
COMPLETE Column stats: COMPLETE
                        TopN Hash Memory Usage: 0.3
                        value expressions: _col0 (type: string)
            Execution mode: vectorized, llap
            LLAP IO: no inputs
        Map 4 
            Map Operator Tree:
                TableScan
                  alias: src2
                  filterExpr: key is not null (type: boolean)
                  Statistics: Num rows: 500 Data size: 89000 Basic stats: 
COMPLETE Column stats: COMPLETE
                  Filter Operator
                    predicate: key is not null (type: boolean)
                    Statistics: Num rows: 500 Data size: 89000 Basic stats: 
COMPLETE Column stats: COMPLETE
                    Select Operator
                      expressions: key (type: string), value (type: string)
                      outputColumnNames: _col0, _col1
                      Statistics: Num rows: 500 Data size: 89000 Basic stats: 
COMPLETE Column stats: COMPLETE
                      Reduce Output Operator
                        key expressions: _col0 (type: string)
                        null sort order: z
                        sort order: +
                        Map-reduce partition columns: _col0 (type: string)
                        Statistics: Num rows: 500 Data size: 89000 Basic stats: 
COMPLETE Column stats: COMPLETE
                        value expressions: _col1 (type: string)
            Execution mode: vectorized, llap
            LLAP IO: no inputs
        Reducer 2 
            Execution mode: vectorized, llap
            Reduce Operator Tree:
              Limit
                Number of rows: 5
                Statistics: Num rows: 5 Data size: 435 Basic stats: COMPLETE 
Column stats: COMPLETE
                Select Operator
                  expressions: VALUE._col0 (type: string)
                  outputColumnNames: _col0
                  Statistics: Num rows: 5 Data size: 435 Basic stats: COMPLETE 
Column stats: COMPLETE
                  Limit
                    Number of rows: 5
                    Statistics: Num rows: 5 Data size: 435 Basic stats: 
COMPLETE Column stats: COMPLETE
                    Reduce Output Operator
                      key expressions: _col0 (type: string)
                      null sort order: z
                      sort order: +
                      Map-reduce partition columns: _col0 (type: string)
                      Statistics: Num rows: 5 Data size: 435 Basic stats: 
COMPLETE Column stats: COMPLETE
        Reducer 3 
            Execution mode: llap
            Reduce Operator Tree:
              Merge Join Operator
                condition map:
                     Left Outer Join 0 to 1
                keys:
                  0 _col0 (type: string)
                  1 _col0 (type: string)
                outputColumnNames: _col0, _col2
                Statistics: Num rows: 12 Data size: 1772 Basic stats: COMPLETE 
Column stats: COMPLETE
                Select Operator
                  expressions: _col0 (type: string), _col2 (type: string)
                  outputColumnNames: _col0, _col1
                  Statistics: Num rows: 12 Data size: 1772 Basic stats: 
COMPLETE Column stats: COMPLETE
                  File Output Operator
                    compressed: false
                    Statistics: Num rows: 12 Data size: 1772 Basic stats: 
COMPLETE Column stats: COMPLETE
                    table:
                        input format: 
org.apache.hadoop.mapred.SequenceFileInputFormat
                        output format: 
org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
                        serde: 
org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe  Stage: Stage-0
    Fetch Operator
      limit: 5
      Processor Tree:
        ListSink {code}

> Limit operator pushdown through LOJ
> -----------------------------------
>
>                 Key: HIVE-23723
>                 URL: https://issues.apache.org/jira/browse/HIVE-23723
>             Project: Hive
>          Issue Type: Improvement
>          Components: Hive
>            Reporter: Attila Magyar
>            Assignee: Attila Magyar
>            Priority: Major
>             Fix For: 4.0.0
>
>         Attachments: HIVE-23723.1.patch
>
>
> Limit operator (without an order by) can be pushed through SELECTS and LEFT 
> OUTER JOINs.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Reply via email to