Aman Sinha has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16942 )

Change subject: IMPALA-10296: Fix analytic limit pushdown when predicates are 
present
......................................................................


Patch Set 11:

(12 comments)

Sending comments based on 1st pass of the planner changes.

http://gerrit.cloudera.org:8080/#/c/16942/11//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/16942/11//COMMIT_MSG@9
PS11, Line 9: when
nit: remove


http://gerrit.cloudera.org:8080/#/c/16942/11//COMMIT_MSG@22
PS11, Line 22: limit
did you mean partition limit >= order by limit ?


http://gerrit.cloudera.org:8080/#/c/16942/11//COMMIT_MSG@32
PS11, Line 32: was
nit: 'is'


http://gerrit.cloudera.org:8080/#/c/16942/11//COMMIT_MSG@40
PS11, Line 40: This patch implements tie handling in the backend (I took most
I had previously wondered about how the planner and backend  work for this 
functionality would be combined but it now starts falling into place with the 
handling of the duplicates :-)


http://gerrit.cloudera.org:8080/#/c/16942/11//COMMIT_MSG@68
PS11, Line 68: The for
nit: 'The elapsed time for..' ?


http://gerrit.cloudera.org:8080/#/c/16942/11/fe/src/main/java/org/apache/impala/planner/AnalyticEvalNode.java
File fe/src/main/java/org/apache/impala/planner/AnalyticEvalNode.java:

http://gerrit.cloudera.org:8080/#/c/16942/11/fe/src/main/java/org/apache/impala/planner/AnalyticEvalNode.java@507
PS11, Line 507: upper top-n.
Since we do a distributed top-n, there are 3 top-n's in the plan and the 'upper 
top-n' may be confusing. Here it refers to the outermost top-n or final top-n 
or something  similar.


http://gerrit.cloudera.org:8080/#/c/16942/11/fe/src/main/java/org/apache/impala/planner/AnalyticEvalNode.java@511
PS11, Line 511: doesn't matter
nit: worth clarifying that it doesn't matter for the purpose of the pushdown 
decision.


http://gerrit.cloudera.org:8080/#/c/16942/11/fe/src/main/java/org/apache/impala/planner/AnalyticEvalNode.java@530
PS11, Line 530: include all of the rows in the final
This does not literally mean all rows in the final partition right ? Should it 
be all eligible rows or all relevant rows ? (based on the P+N value)


http://gerrit.cloudera.org:8080/#/c/16942/11/fe/src/main/java/org/apache/impala/planner/AnalyticEvalNode.java@531
PS11, Line 531: was
nit: 'we'


http://gerrit.cloudera.org:8080/#/c/16942/11/fe/src/main/java/org/apache/impala/planner/AnalyticEvalNode.java@603
PS11, Line 603:     if (analyticLimit < limit) return falseStatus;
One special case where this could work is if each partition had a maximum of 
analyticLimit rows.  Then we know we are not excluding rows as we iterate 
through the partitions until we reach the final limit.  Of course, this 
knowledge is not readily available or may not be relied upon due to ndv 
estimates but in practice I suspect this may not be uncommon.
For this patch, it makes sense to be conservative in applying the optimization.


http://gerrit.cloudera.org:8080/#/c/16942/11/fe/src/main/java/org/apache/impala/planner/SortNode.java
File fe/src/main/java/org/apache/impala/planner/SortNode.java:

http://gerrit.cloudera.org:8080/#/c/16942/11/fe/src/main/java/org/apache/impala/planner/SortNode.java@83
PS11, Line 83: row.s
nit: 'rows'


http://gerrit.cloudera.org:8080/#/c/16942/11/testdata/workloads/functional-planner/queries/PlannerTest/limit-pushdown-analytic.test
File 
testdata/workloads/functional-planner/queries/PlannerTest/limit-pushdown-analytic.test:

http://gerrit.cloudera.org:8080/#/c/16942/11/testdata/workloads/functional-planner/queries/PlannerTest/limit-pushdown-analytic.test@1139
PS11, Line 1139: # rank() predicate is not pushed down because TOPN_BYTES_LIMIT 
prevents conversion
The plan shows the lower top-n which indicates the rank was pushed down. 
Perhaps the top_bytes_limit should be even smaller if this is supposed to be a 
negative test.



--
To view, visit http://gerrit.cloudera.org:8080/16942
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I801d7799b0d649c73d2dd1703729a9b58a662509
Gerrit-Change-Number: 16942
Gerrit-PatchSet: 11
Gerrit-Owner: Tim Armstrong <[email protected]>
Gerrit-Reviewer: Aman Sinha <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Thomas Tauber-Marshall <[email protected]>
Gerrit-Reviewer: Tim Armstrong <[email protected]>
Gerrit-Comment-Date: Sun, 17 Jan 2021 00:36:51 +0000
Gerrit-HasComments: Yes

Reply via email to