[
https://issues.apache.org/jira/browse/IMPALA-14158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18029387#comment-18029387
]
ASF subversion and git services commented on IMPALA-14158:
----------------------------------------------------------
Commit cde4bc016c02cf582f2469083392b0bcc7f2bf56 in impala's branch
refs/heads/master from Steve Carlin
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=cde4bc016 ]
IMPALA-14115: Calcite planner: Added top-n analytic PlanNode optimization.
Impala has an optimization for analytic expressions that have a rank filter on
top of the analytic expression. It can add a top-n plan node to reduce the
number of rows examined. This is exercised by TPC-DS query 67.
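
As an illustration of why such a rewrite is safe, here is a minimal Python
sketch (not Impala code; the function names and the (partition, value) row
shape are assumptions made for this example). It compares ranking every row and
then filtering against pruning each partition to its top rows first, keeping
ties at the cutoff. NULL handling is deliberately left out.

```python
from collections import defaultdict


def rank_then_filter(rows, limit):
    """Baseline shape: compute rank() for every row, then keep rnk <= limit."""
    parts = defaultdict(list)
    for part, value in rows:
        parts[part].append(value)
    result = []
    for part, values in parts.items():
        for v in values:
            # rank() for ORDER BY value DESC: 1 + rows strictly ahead of v.
            rnk = 1 + sum(1 for other in values if other > v)
            if rnk <= limit:
                result.append((part, v, rnk))
    return sorted(result)


def topn_then_rank(rows, limit):
    """Sketch of the optimized shape: prune per partition, then rank."""
    parts = defaultdict(list)
    for part, value in rows:
        parts[part].append(value)
    pruned = []
    for part, values in parts.items():
        ordered = sorted(values, reverse=True)
        # Value at position `limit`; rows tied with it must be kept.
        cutoff = ordered[min(limit, len(ordered)) - 1]
        pruned.extend((part, v) for v in values if v >= cutoff)
    # Only rows below the cutoff were dropped, so ranks are unchanged.
    return rank_then_filter(pruned, limit)
```

Both functions return the same rows for a filter such as rnk <= 2, but the
second one only ranks the rows that survive the per-partition pruning.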
The optimization logic relies on an unassigned rank conjunct within the analyzer
while creating the analytic plan node.
A slight reorganization of the code was needed to implement this optimization.
The SlotRefs for the AnalyticInfo are now created earlier than they were in
the previous commit.
A small fix was made to normalize binary predicates. A non-normalized binary
predicate prevents the optimization from being used.
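
A rough Python sketch of what normalizing could look like, assuming that
"normalized" means the column reference is on the left and the constant on the
right. The tuple representation and the normalize helper are hypothetical and
are not Impala's or Calcite's API.

```python
# Operator to use when the two operands of a comparison are swapped.
FLIPPED = {"<": ">", "<=": ">=", ">": "<", ">=": "<=", "=": "=", "!=": "!="}


def normalize(left, op, right):
    """Return the predicate with the column on the left.

    In this toy model, integer operands are literals and string operands
    are column references.
    """
    if isinstance(left, int) and isinstance(right, str):
        return (right, FLIPPED[op], left)
    return (left, op, right)
```

With this model, normalize(5, ">=", "rnk") yields ("rnk", "<=", 5), which is the
shape a rank filter check would most plausibly look for.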
A call to checkAndApplyLimitPushdown is needed for some of the optimizations
to kick in.
A new AllProjectInfo internal class was created to hold the relationships
between the Calcite RexNode objects and the Impala Analytic expressions.
Also, IMPALA-14158 is fixed by this commit. The nullsFirst value was
incorrect when the syntax was explicit in the query.
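
To make the expected behavior concrete, here is a small Python sketch of how a
nullsFirst flag could be resolved. It is not the actual fix: the helper names
are hypothetical, and the default (nulls last for ascending, nulls first for
descending) follows Impala's documented ORDER BY behavior.

```python
def resolve_nulls_first(descending, explicit=None):
    """Return True if NULLs should sort first.

    `explicit` is "FIRST", "LAST", or None when the query does not say.
    An explicit clause always wins over the direction-based default.
    """
    if explicit is not None:
        return explicit == "FIRST"
    # Default: NULLs sort as the largest value, so they come first for DESC.
    return descending


def sort_prices(prices, descending, nulls_first):
    """Order values, placing None entries according to nulls_first."""
    non_null = sorted((p for p in prices if p is not None), reverse=descending)
    nulls = [p for p in prices if p is None]
    return nulls + non_null if nulls_first else non_null + nulls
```

For an ordering like "ss_list_price desc nulls last", the explicit clause gives
nulls_first = False, so the NULL prices end up after all non-NULL prices.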
A new Calcite planner test was added to the JUnit tests to ensure the
optimization kicks in. The new test file is
PlannerTest/calcite/limit-pushdown-analytic-calcite.test. It is a copy of the
limit-pushdown-analytic.test file in its parent directory, with some modified
results. Most of the differences are trivial, but IMPALA-14469 has been filed
to track one optimization that is still not applied: the case where the
order by clause has a constant expression.
Change-Id: Ie6fa6781db56771b13b0cf49bd236f776016bf8d
Reviewed-on: http://gerrit.cloudera.org:8080/23317
Tested-by: Impala Public Jenkins <[email protected]>
Reviewed-by: Aman Sinha <[email protected]>
> Calcite planner: nulls last not being handled correctly for analytical
> function
> -------------------------------------------------------------------------------
>
> Key: IMPALA-14158
> URL: https://issues.apache.org/jira/browse/IMPALA-14158
> Project: IMPALA
> Issue Type: Sub-task
> Reporter: Steve Carlin
> Priority: Major
>
> The following query, from analytic-fns-tpcds-partitioned-topn.test, is
> putting nulls first:
> select *
> from (
>   select s_store_name, s_state, ss_list_price,
>     rank() over (partition by s_store_name, s_state
>                  order by ss_list_price desc nulls last) rnk
>   from store_sales ss
>   join store s on ss_store_sk = s_store_sk) v
> where rnk <= 5;