[
https://issues.apache.org/jira/browse/HIVE-25606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17427628#comment-17427628
]
Krisztian Kasa edited comment on HIVE-25606 at 10/12/21, 11:31 AM:
-------------------------------------------------------------------
It seems that the order by clause is intentionally dropped during parsing:
{code:java}
explain ast
select * from (select * from test_1009 order by id desc) a limit 10;
{code}
{code:java}
ABSTRACT SYNTAX TREE:
TOK_QUERY
TOK_FROM
TOK_SUBQUERY
TOK_QUERY
TOK_FROM
TOK_TABREF
TOK_TABNAME
test_1009
TOK_INSERT
TOK_DESTINATION
TOK_DIR
TOK_TMP_FILE
TOK_SELECT
TOK_SELEXPR
TOK_ALLCOLREF
a
TOK_INSERT
TOK_DESTINATION
TOK_DIR
TOK_TMP_FILE
TOK_SELECT
TOK_SELEXPR
TOK_ALLCOLREF
TOK_LIMIT
10
{code}
When the Limit is moved after Order by to the subquery Hive parses the Order by
too since it has an affect on the overall query result:
{code:java}
explain ast
select * from (select * from test_1009 order by id desc limit 10) a;
{code}
{code:java}
ABSTRACT SYNTAX TREE:
TOK_QUERY
TOK_FROM
TOK_SUBQUERY
TOK_QUERY
TOK_FROM
TOK_TABREF
TOK_TABNAME
test_1009
TOK_INSERT
TOK_DESTINATION
TOK_DIR
TOK_TMP_FILE
TOK_SELECT
TOK_SELEXPR
TOK_ALLCOLREF
TOK_ORDERBY
TOK_TABSORTCOLNAMEDESC
TOK_NULLS_FIRST
TOK_TABLE_OR_COL
id
TOK_LIMIT
10
a
TOK_INSERT
TOK_DESTINATION
TOK_DIR
TOK_TMP_FILE
TOK_SELECT
TOK_SELEXPR
TOK_ALLCOLREF
{code}
Some other database engines also drop the order by clause without limit from
subquery or throw exception.
As far as I know SQL standard allows Order by in the main query only but many
database engines allows it in subquery however it is not guaranteed that it has
an effect.
was (Author: kkasa):
It seems that the order by clause is intentionally dropped during parsing:
{code:java}
explain ast
select * from (select * from test_1009 order by id desc) a limit 10;
{code}
{code:java}
ABSTRACT SYNTAX TREE:
TOK_QUERY
TOK_FROM
TOK_SUBQUERY
TOK_QUERY
TOK_FROM
TOK_TABREF
TOK_TABNAME
test_1009
TOK_INSERT
TOK_DESTINATION
TOK_DIR
TOK_TMP_FILE
TOK_SELECT
TOK_SELEXPR
TOK_ALLCOLREF
a
TOK_INSERT
TOK_DESTINATION
TOK_DIR
TOK_TMP_FILE
TOK_SELECT
TOK_SELEXPR
TOK_ALLCOLREF
TOK_LIMIT
10
{code}
When the Limit is moved after Order by to the subquery Hive parses the Order by
too since it has an affect of the overall query result:
{code:java}
explain ast
select * from (select * from test_1009 order by id desc limit 10) a;
{code}
{code:java}
ABSTRACT SYNTAX TREE:
TOK_QUERY
TOK_FROM
TOK_SUBQUERY
TOK_QUERY
TOK_FROM
TOK_TABREF
TOK_TABNAME
test_1009
TOK_INSERT
TOK_DESTINATION
TOK_DIR
TOK_TMP_FILE
TOK_SELECT
TOK_SELEXPR
TOK_ALLCOLREF
TOK_ORDERBY
TOK_TABSORTCOLNAMEDESC
TOK_NULLS_FIRST
TOK_TABLE_OR_COL
id
TOK_LIMIT
10
a
TOK_INSERT
TOK_DESTINATION
TOK_DIR
TOK_TMP_FILE
TOK_SELECT
TOK_SELEXPR
TOK_ALLCOLREF
{code}
Some other database engines also drop the order by clause without limit from
subquery or throw exception.
As far as I know SQL standard allows Order by in the main query only but many
database engines allows it in subquery however it is not guaranteed that it has
an effect.
> The global limit invalidates the subquery order by
> --------------------------------------------------
>
> Key: HIVE-25606
> URL: https://issues.apache.org/jira/browse/HIVE-25606
> Project: Hive
> Issue Type: Bug
> Components: CBO
> Affects Versions: 3.1.1
> Reporter: jinwensc
> Priority: Major
>
> create table test_1009(id int);
> insert into table test_1009
> values(1),(2),(3),(4),(5),(6),(7),(8),(9),(10),(11),(12),(13);
> select * from (select * from test_1009 order by id desc) a limit 10;
>
> +-------+
> | a.id |
> +-------+
> | 1 |
> | 2 |
> | 3 |
> | 4 |
> | 5 |
> | 6 |
> | 7 |
> | 8 |
> | 9 |
> | 10 |
> +–+
>
> -----------------------
> the right result should be
> +-------+
> | a.id |
> +-------+
> | 13 |
> | 12 |
> | 11 |
> | 10 |
> | 9 |
> | 8 |
> | 7 |
> | 6 |
> | 5 |
> | 4 |
> +-------+
>
--
This message was sent by Atlassian Jira
(v8.3.4#803005)