[ 
https://issues.apache.org/jira/browse/IMPALA-10229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17210505#comment-17210505
 ] 

Tim Armstrong commented on IMPALA-10229:
----------------------------------------

A standalone example. It's easier to reproduce with num_nodes=1 btw, since the 
distributed top-n tends to filter out fewer rows.
{noformat}
create table ranktbl(part int, ord int) stored as parquet;
insert into ranktbl values (1, 1), (1, 2), (1, 3), (1, 4), (2, 1), (2, 2), (2, 
3), (2, 4);
set num_nodes=1;
select * from (
  select *, rank() over (partition by part order by ord) rnk from ranktbl) v
where rnk < 5
order by part, ord desc
limit 5;
+------+-----+-----+
| part | ord | rnk |
+------+-----+-----+
| 1    | 4   | 4   |
| 1    | 3   | 3   |
| 1    | 2   | 2   |
| 1    | 1   | 1   |
| 2    | 1   | 1   |
+------+-----+-----+
select * from (select *, rank() over (partition by part order by ord) rnk from 
ranktbl) v order by part, ord desc limit 2;

+------+-----+-----+
| part | ord | rnk |
+------+-----+-----+
| 1    | 2   | 2   |
| 1    | 1   | 1   |
+------+-----+-----+
{noformat}

I'm thinking this through, but I think we need some stronger preconditions for 
the optimization.

* Case 1: the sort expressions fully match.
In this case the row order is the same between the output of the top-n and the 
input of the analytic. So the first N rows returned from the top-n will be the 
same rows as the output of the analytic sort *if* there are no rows filtered 
out by a SELECT node in-between. So we need to ensure that the SELECT 
predicates are always true. This would be true for, say, rank() <= 1000 if 
there was a limit 100, since it will definitely be true for the first 100 rows.

* Case 2: only the partition exprs match
In this case the rows returned are going to come from the first partitions, 
i.e. if the limit is 100, if we go through the partitions in order until the 
row count adds up to 100, then we know that the rows much come from those 
partitions.

The issue here is that, for the final partition, if the sort order is 
different, then the rows returned from the top-n may not be the first rows in 
the partition for the analytic. So if we filter out the first rows in the 
partition before they are fed into the analytic, the results of the analytic 
function can change.

One solution would be to increase the pushed down limit so that it was always 
guaranteed to include all of the rows in the final partition to be returned. 
E.g. if you had a row_number() <= 100 predicate and limit 100, if you pushed 
down limit 200, then you'd be guaranteed to capture all of the rows in the 
final partition. That may be trickier when it comes to rank() where you need to 
deal with ties (since the top-n operator doesn't currently handle ties).


> Analytic limit pushdown optimization can be applied incorrect when there are 
> no analytic predicates
> ---------------------------------------------------------------------------------------------------
>
>                 Key: IMPALA-10229
>                 URL: https://issues.apache.org/jira/browse/IMPALA-10229
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Frontend
>            Reporter: Tim Armstrong
>            Assignee: Tim Armstrong
>            Priority: Blocker
>              Labels: correctness
>
> {noformat}
> [localhost.EXAMPLE.COM:21050] default> select * from (select month, id, 
> rank() over (partition by month order by id desc) rnk from 
> functional_parquet.alltypes WHERE month >= 11) v order by month, id limit 3;
> +-------+------+-----+
> | month | id   | rnk |
> +-------+------+-----+
> | 11    | 6987 | 3   |
> | 11    | 6988 | 2   |
> | 11    | 6989 | 1   |
> +-------+------+-----+
> Fetched 3 row(s) in 4.16s
> {noformat}
> These are not the top 3 rows when ordering by month, id . Hive's result is 
> correct:
> {noformat}
> +----------+-------+--------+
> | v.month  | v.id  | v.rnk  |
> +----------+-------+--------+
> | 11       | 3040  | 600    |
> | 11       | 3041  | 599    |
> | 11       | 3042  | 598    |
> +----------+-------+--------+
> {noformat}
> I think when there's no select predicates, that the ordering in the analytic 
> sort needs to exactly match the TOP N sort ordering. I'm not sure if there 
> are fixes needed for the case where there are select predicates.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to