Oleksiy Sayankin created HIVE-21802: ---------------------------------------
Summary: Unexpected change in HiveQL clause order Key: HIVE-21802 URL: https://issues.apache.org/jira/browse/HIVE-21802 Project: Hive Issue Type: Bug Components: Parser, Query Processor Reporter: Oleksiy Sayankin Assignee: Oleksiy Sayankin This query worked in Hive 1.2 ( ({{ORDER}} clause _before_ {{WINDOW}})): {code:java} CREATE TABLE ccdp_v02 AS SELECT * from (select cust_xref_id, cstone_last_updatetm, instal_late_pay_ct, ROW_NUMBER() over w1 as RN, a.report_dt from cstonedb3.gam_ccdp_us a where report_dt = '2019-05-01' and cust_xref_id in (1234) order by cust_xref_id, a.report_dt, cstone_last_updatetm desc WINDOW w1 as (partition by a.cust_xref_id, a.report_dt order by a.cstone_last_updatetm desc) ) tmp where RN=1 DISTRIBUTE BY report_dt; {code} In Hive2.1 it fails with: {code:java} hive> SELECT id > FROM ( > SELECT > id, > a1, > ROW_NUMBER() OVER w1 AS RN, > b1 > FROM i a > ORDER BY id, b1, a1 DESC > WINDOW w1 as (PARTITION BY id, b1 ORDER BY a1 DESC) > ); NoViableAltException(257@[]) at org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.atomjoinSource(HiveParser_FromClauseParser.java:2269) at org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.joinSource(HiveParser_FromClauseParser.java:2479) at org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.fromSource(HiveParser_FromClauseParser.java:1692) at org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.fromClause(HiveParser_FromClauseParser.java:1313) at org.apache.hadoop.hive.ql.parse.HiveParser.fromClause(HiveParser.java:42092) at org.apache.hadoop.hive.ql.parse.HiveParser.atomSelectStatement(HiveParser.java:36765) at org.apache.hadoop.hive.ql.parse.HiveParser.selectStatement(HiveParser.java:37017) at org.apache.hadoop.hive.ql.parse.HiveParser.regularBody(HiveParser.java:36663) at org.apache.hadoop.hive.ql.parse.HiveParser.queryStatementExpressionBody(HiveParser.java:35852) at org.apache.hadoop.hive.ql.parse.HiveParser.queryStatementExpression(HiveParser.java:35740) at org.apache.hadoop.hive.ql.parse.HiveParser.execStatement(HiveParser.java:2307) at org.apache.hadoop.hive.ql.parse.HiveParser.statement(HiveParser.java:1335) at org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:208) at org.apache.hadoop.hive.ql.parse.ParseUtils.parse(ParseUtils.java:77) at org.apache.hadoop.hive.ql.parse.ParseUtils.parse(ParseUtils.java:70) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:468) at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1317) at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1457) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1237) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1227) at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:239) at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:187) at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:409) at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:836) at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:774) at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:697) at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:692) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.apache.hadoop.util.RunJar.run(RunJar.java:221) at org.apache.hadoop.util.RunJar.main(RunJar.java:136) FAILED: ParseException line 3:4 cannot recognize input near '(' 'SELECT' 'id' in joinSource hive> {code} *STEPS TO REPRODUCE:* 1. Create a table: {code:java} CREATE TABLE i (id INT, a1 INT, b1 BOOLEAN); {code} 2. Run the query which was working in Hive-1.2: ({{ORDER}} clause _before_ {{WINDOW}}) {code:java} SELECT id FROM ( SELECT id, a1, ROW_NUMBER() OVER w1 AS rn, b1 FROM i a ORDER BY id, b1, a1 DESC WINDOW w1 as (PARTITION BY id, b1 ORDER BY a1 DESC) ) tmp WHERE rn=1 DISTRIBUTE BY id; {code} *ACTUAL RESULT:* The query fails with an exception you can find above. The query from Step 2 which works for Hive-2.3 is ( ({{ORDER}} clause _after_ {{WINDOW}})): {code:java} SELECT id FROM ( SELECT id, a1, ROW_NUMBER() OVER w1 AS rn, b1 FROM i a WINDOW w1 as (PARTITION BY id, b1 ORDER BY a1 DESC) ORDER BY id, b1, a1 DESC ) tmp WHERE rn=1 DISTRIBUTE BY id; {code} Hive-2.3 also fails to parse subquery ( ({{ORDER}} clause _before_ {{WINDOW}})): {code:java} SELECT id, a1, ROW_NUMBER() OVER w1 AS rn, b1 FROM i a ORDER BY id, b1, a1 DESC WINDOW w1 as (PARTITION BY id, b1 ORDER BY a1 DESC) {code} Customer is facing issue regularly this is occurring after upgrade . the workaround given by us need lot of code changes which will get them back to squire one they need to follow process and will cost them a lot. -- This message was sent by Atlassian JIRA (v7.6.3#76005)