[jira] [Commented] (HIVE-17342) Where condition with 1=0 should be treated similar to limit 0

Krisztian Kasa (Jira) Tue, 30 Aug 2022 22:40:23 -0700


    [ 
https://issues.apache.org/jira/browse/HIVE-17342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17598180#comment-17598180
 ]


Krisztian Kasa commented on HIVE-17342:
---------------------------------------

[~amansinha]
I found that in some cases CBO ends up with a plan having a {{HiveProject}} root 
node instead of {{HiveSortLimit(fetch=[0])}}. This is translated to a 
subquery at the Hive physical level and the limit 0 optimization is not used.

Setting
{code:java}
set hive.optimize.limittranspose=true;
{code}
enables {{HiveProjectSortTransposeRule}}, which can transpose {{HiveProject}} 
and {{HiveSortLimit}} so that {{HiveSortLimit(fetch=[0])}} becomes the new root.
For your query:
{code:java}
POSTHOOK: query: explain cbo
select y from (select a1 y from t1 where b1 > 10) q WHERE 1=0
CBO PLAN:
HiveSortLimit(fetch=[0])
  HiveProject(y=[$0])
    HiveFilter(condition=[>($1, 10)])
      HiveTableScan(table=[[default, t1]], table:alias=[t1])
{code}
{code:java}
POSTHOOK: query: explain
select y from (select a1 y from t1 where b1 > 10) q WHERE 1=0
Plan optimized by CBO.

Stage-0
  Fetch Operator
    limit:0
{code}
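
As a rough illustration of why the root node matters, here is a toy sketch in Python. It is not Hive or Calcite code: the {{Node}} class, {{transpose_project_over_limit}} and {{is_limit_zero}} are hypothetical names, and the "before" plan shape is an assumption based on the description above. It only shows that a check which inspects the plan root starts to succeed once the project and the limit are swapped.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Node:
    """Toy plan node; op names mirror the Hive operators only for readability."""
    op: str
    child: Optional["Node"] = None
    fetch: Optional[int] = None  # only meaningful for SortLimit nodes


def transpose_project_over_limit(root: Node) -> Node:
    """Rewrite Project(SortLimit(x)) into SortLimit(Project(x)); else no change."""
    child = root.child
    if root.op == "Project" and child is not None and child.op == "SortLimit":
        return Node("SortLimit", Node("Project", child.child), fetch=child.fetch)
    return root


def is_limit_zero(root: Node) -> bool:
    """Hypothetical stand-in for a check that only looks at the plan root."""
    return root.op == "SortLimit" and root.fetch == 0


# Assumed "before" shape: the project sits on top of the limit.
scan = Node("TableScan")
before = Node("Project", Node("SortLimit", Node("Filter", scan), fetch=0))
after = transpose_project_over_limit(before)

print(is_limit_zero(before), is_limit_zero(after))
```

In this toy model the rewrite is safe because a pure limit does not depend on the projected columns; the real rule has its own applicability conditions, which are not modelled here.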

> Where condition with 1=0 should be treated similar to limit 0
> -------------------------------------------------------------
>
>                 Key: HIVE-17342
>                 URL: https://issues.apache.org/jira/browse/HIVE-17342
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: Rajesh Balamohan
>            Assignee: Krisztian Kasa
>            Priority: Minor
>
> In some cases, queries may be executed with a where condition of 
> "1=0" just to get the schema. E.g. 
> {noformat}
> SELECT * FROM (select avg(d_year) as  y from date_dim where d_year>1999) q 
> WHERE 1=0
> {noformat}
> Currently hive executes the query; it would be good to consider this similar 
> to "limit 0" which does not execute the query.
> {code}
> hive> explain SELECT * FROM (select avg(d_year) as  y from date_dim where 
> d_year>1999) q WHERE 1=0;
> OK
> Plan optimized by CBO.
> Vertex dependency in root stage
> Reducer 2 <- Map 1 (CUSTOM_SIMPLE_EDGE)
> Stage-0
>   Fetch Operator
>     limit:-1
>     Stage-1
>       Reducer 2 vectorized, llap
>       File Output Operator [FS_13]
>         Group By Operator [GBY_12] (rows=1 width=76)
>           Output:["_col0"],aggregations:["avg(VALUE._col0)"]
>         <-Map 1 [CUSTOM_SIMPLE_EDGE] vectorized, llap
>           PARTITION_ONLY_SHUFFLE [RS_11]
>             Group By Operator [GBY_10] (rows=1 width=76)
>               Output:["_col0"],aggregations:["avg(d_year)"]
>               Filter Operator [FIL_9] (rows=1 width=0)
>                 predicate:false
>                 TableScan [TS_0] (rows=1 width=0)
>                   default@date_dim,date_dim,Tbl:PARTIAL,Col:NONE,Output:["d_year"]
> {code}
> It does generate 0 splits, but still sends a DAG plan to the AM and receives 0 
> rows as output.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
