[ 
https://issues.apache.org/jira/browse/HIVE-25589?focusedWorklogId=773885&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-773885
 ]

ASF GitHub Bot logged work on HIVE-25589:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 24/May/22 06:16
            Start Date: 24/May/22 06:16
    Worklog Time Spent: 10m 
      Work Description: kgyrtkirk commented on code in PR #3266:
URL: https://github.com/apache/hive/pull/3266#discussion_r880096997


##########
ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java:
##########
@@ -1839,6 +1839,15 @@ boolean doPhase1(ASTNode ast, QB qb, Phase1Ctx ctx_1, PlannerContext plannerCtx)
             doPhase1GetDistinctFuncExprs(qbp.getAggregationExprsForClause(ctx_1.dest)));
         break;
 
+      case HiveParser.TOK_QUALIFY:
+        if (!HiveConf.getBoolVar(conf, HiveConf.ConfVars.HIVE_CBO_ENABLED)) {

Review Comment:
   I think that when we fall back to the non-CBO path (which is the more likely case in practice), this flag will not be changed; see around:
   
https://github.com/apache/hive/blob/35d4532b0c08f4f5fbb5dc897c4330cba434bc7c/ql/src/java/org/apache/hadoop/hive/ql/parse/CalcitePlanner.java#L706
   
   I think `set hive.cbo.fallback.strategy=ALWAYS` could also be used to check whether that would happen.
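
   A minimal sketch of how such a check might be run, assuming a test session with this patch applied and reusing the `duplicated_table` example from the issue description. Whether this particular query actually makes CBO fail and fall back is an assumption; a query that is known to fail during CBO planning would be needed to really exercise the fallback path.

   ```sql
   -- Keep CBO enabled so that the HIVE_CBO_ENABLED check in doPhase1 passes.
   SET hive.cbo.enable=true;
   -- Always fall back to the non-CBO path when CBO planning fails.
   SET hive.cbo.fallback.strategy=ALWAYS;

   -- Illustrative query only: table and column names come from the issue description.
   EXPLAIN
   SELECT * FROM duplicated_table
   QUALIFY ROW_NUMBER() OVER (PARTITION BY event_id) = 1;
   ```

   If the fallback happens with the flag still reporting CBO as enabled, the plan (or the error) from the non-CBO path should show whether QUALIFY is handled there.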





Issue Time Tracking
-------------------

    Worklog Id:     (was: 773885)
    Time Spent: 50m  (was: 40m)

> SQL: Implement HAVING/QUALIFY predicates for ROW_NUMBER()=1
> -----------------------------------------------------------
>
>                 Key: HIVE-25589
>                 URL: https://issues.apache.org/jira/browse/HIVE-25589
>             Project: Hive
>          Issue Type: Improvement
>          Components: CBO, SQL
>    Affects Versions: 4.0.0
>            Reporter: Gopal Vijayaraghavan
>            Assignee: Krisztian Kasa
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 50m
>  Remaining Estimate: 0h
>
> Insert queries that filter on ROW_NUMBER() = 1 are inconvenient to write 
> or port from an existing workload, because there is no easy way to ignore a 
> column in this pattern.
> {code}
> INSERT INTO main_table 
> SELECT * from duplicated_table
> QUALIFY ROW_NUMBER() OVER (PARTITION BY event_id) = 1;
> {code}
> needs to be rewritten into
> {code}
> INSERT INTO main_table
> select event_id, event_ts, event_attribute, event_metric1, event_metric2, 
> event_metric3, event_metric4, .., event_metric43 from 
> (select *, ROW_NUMBER() OVER (PARTITION BY event_id) as rnum from 
> duplicated_table)
> where rnum=1;
> {code}
> This is a time-consuming and error-prone rewrite (the column order can 
> easily get mixed up between the source and destination tables).
> An alternative would be to support the same or similar syntax using HAVING.
> {code}
> INSERT INTO main_table 
> SELECT * from duplicated_table
> HAVING ROW_NUMBER() OVER (PARTITION BY event_id) = 1;
> {code}



--
This message was sent by Atlassian Jira
(v8.20.7#820007)
