Tim Armstrong created IMPALA-10229:
--------------------------------------
Summary: Analytic limit pushdown optimization can be applied
incorrectly when there are no analytic predicates
Key: IMPALA-10229
URL: https://issues.apache.org/jira/browse/IMPALA-10229
Project: IMPALA
Issue Type: Bug
Components: Frontend
Reporter: Tim Armstrong
Assignee: Tim Armstrong
{noformat}
[localhost.EXAMPLE.COM:21050] default> select * from (select month, id, rank()
over (partition by month order by id desc) rnk from functional_parquet.alltypes
WHERE month >= 11) v order by month, id limit 3;
+-------+------+-----+
| month | id   | rnk |
+-------+------+-----+
| 11    | 6987 | 3   |
| 11    | 6988 | 2   |
| 11    | 6989 | 1   |
+-------+------+-----+
Fetched 3 row(s) in 4.16s
{noformat}
These are not the top 3 rows when ordering by month, id. Hive's result is
correct:
{noformat}
+----------+-------+--------+
| v.month  | v.id  | v.rnk  |
+----------+-------+--------+
| 11       | 3040  | 600    |
| 11       | 3041  | 599    |
| 11       | 3042  | 598    |
+----------+-------+--------+
{noformat}
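The difference between the two results can be modeled outside Impala. The following is a minimal Python sketch on synthetic data (not the real functional_parquet.alltypes rows). It assumes the wrong result comes from applying the limit while rows are still in the analytic sort order (month, id desc) rather than the outer order (month, id). That is a guess at the mechanism, not a confirmed description of the planner. The names with_rank, correct_top_n and pushed_down_top_n are hypothetical.

```python
from itertools import groupby

# Synthetic (month, id) rows; ids are unique, as in the query above.
rows = [(11, i) for i in range(1, 6)] + [(12, i) for i in range(6, 10)]


def with_rank(rows):
    """Model rank() over (partition by month order by id desc)."""
    analytic_sorted = sorted(rows, key=lambda r: (r[0], -r[1]))
    out = []
    for _, partition in groupby(analytic_sorted, key=lambda r: r[0]):
        # ids are unique, so rank equals the 1-based position in the partition.
        for pos, (month, id_) in enumerate(partition, start=1):
            out.append((month, id_, pos))
    return out  # still in analytic sort order: month asc, id desc


def correct_top_n(rows, n):
    """Evaluate the analytic function on all rows, then ORDER BY month, id LIMIT n."""
    return sorted(with_rank(rows), key=lambda r: (r[0], r[1]))[:n]


def pushed_down_top_n(rows, n):
    """Assumed faulty plan: limit taken in analytic sort order, then re-sorted."""
    limited = with_rank(rows)[:n]
    return sorted(limited, key=lambda r: (r[0], r[1]))


print(correct_top_n(rows, 3))
print(pushed_down_top_n(rows, 3))
```

On this data the first call returns the lowest ids in month 11 with the highest ranks, like the Hive output, while the second returns the highest ids with ranks 3, 2, 1, like the Impala output.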
I think that when there are no select predicates, the ordering in the analytic
sort needs to exactly match the TOP N sort ordering. I'm not sure whether fixes
are also needed for the case where there are select predicates.
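The proposed condition can be written down as a small check. This is only a sketch of the hypothesis above, in Python rather than the frontend's own code. It assumes partition columns sort ascending in the analytic sort, and the helper names analytic_sort_keys and limit_pushdown_is_safe are made up for illustration. It is not the actual fix.

```python
from typing import List, Sequence, Tuple

SortKey = Tuple[str, str]  # (column name, "asc" or "desc")


def analytic_sort_keys(partition_by: Sequence[str],
                       order_by: Sequence[SortKey]) -> List[SortKey]:
    """Sort keys of the analytic sort: partition columns first, then ORDER BY keys.

    Assumption: partition columns are sorted ascending.
    """
    return [(col, "asc") for col in partition_by] + list(order_by)


def limit_pushdown_is_safe(partition_by: Sequence[str],
                           order_by: Sequence[SortKey],
                           top_n_order: Sequence[SortKey]) -> bool:
    """Conservative rule: push the limit down only on an exact ordering match."""
    return analytic_sort_keys(partition_by, order_by) == list(top_n_order)


# The query from this report: analytic sort is (month asc, id desc),
# but the outer TOP N orders by (month asc, id asc).
print(limit_pushdown_is_safe(["month"], [("id", "desc")],
                             [("month", "asc"), ("id", "asc")]))

# Same query with "order by id" (ascending) in the analytic clause.
print(limit_pushdown_is_safe(["month"], [("id", "asc")],
                             [("month", "asc"), ("id", "asc")]))
```

Under this rule the query in this report would not qualify for the pushdown, because the direction of id differs between the two orderings.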