Riza Suminto has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20498 )

Change subject: IMPALA-12018: Consider runtime filter for cardinality reduction
......................................................................


Patch Set 6:

(18 comments)

http://gerrit.cloudera.org:8080/#/c/20498/5//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/20498/5//COMMIT_MSG@18
PS5, Line 18: contiguou
> Nit: contiguous
Done


http://gerrit.cloudera.org:8080/#/c/20498/5//COMMIT_MSG@26
PS5, Line 26:  exe
> Nit: modes.
Done


http://gerrit.cloudera.org:8080/#/c/20498/5//COMMIT_MSG@26
PS5, Line 26: reduction is
> "... reduction is present/available in all ..."
Done


http://gerrit.cloudera.org:8080/#/c/20498/5/fe/src/main/java/org/apache/impala/planner/JoinNode.java
File fe/src/main/java/org/apache/impala/planner/JoinNode.java:

http://gerrit.cloudera.org:8080/#/c/20498/5/fe/src/main/java/org/apache/impala/planner/JoinNode.java@454
PS5, Line 454: tility m
> Now that it's public, is it still internal?
Removed this.


http://gerrit.cloudera.org:8080/#/c/20498/5/fe/src/main/java/org/apache/impala/planner/JoinNode.java@475
PS5, Line 475: computeGenericJoinCardina
> This method was renamed.
Done


http://gerrit.cloudera.org:8080/#/c/20498/5/fe/src/main/java/org/apache/impala/planner/PlanNode.java
File fe/src/main/java/org/apache/impala/planner/PlanNode.java:

http://gerrit.cloudera.org:8080/#/c/20498/5/fe/src/main/java/org/apache/impala/planner/PlanNode.java@1180
PS5, Line 1180: ScanNode, ExchangeNode
> Aren't also ExchangeNodes included?
Done


http://gerrit.cloudera.org:8080/#/c/20498/5/fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java
File fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java:

http://gerrit.cloudera.org:8080/#/c/20498/5/fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java@710
PS5, Line 710:           || buildSlotRef.getDesc().getParent() == null
> Is it possible that buildSlotRef.getDesc().getParent() is null?
Thanks! I added that check.


http://gerrit.cloudera.org:8080/#/c/20498/5/fe/src/main/java/org/apache/impala/planner/ScanNode.java
File fe/src/main/java/org/apache/impala/planner/ScanNode.java:

http://gerrit.cloudera.org:8080/#/c/20498/5/fe/src/main/java/org/apache/impala/planner/ScanNode.java@22
PS5, Line 22: import java.util.List;
> Seems to be unused.
Done


http://gerrit.cloudera.org:8080/#/c/20498/5/fe/src/main/java/org/apache/impala/planner/ScanNode.java@25
PS5, Line 25:
> Seems to be unused.
Done


http://gerrit.cloudera.org:8080/#/c/20498/5/fe/src/main/java/org/apache/impala/planner/ScanNode.java@102
PS5, Line 102: tivity_ = 1.0;
> It is not set there, and I can't find any other place where it's set.
Thanks for catching this! This should be assigned at 
ScanNode.reduceCardinalityByRuntimeFilter().


http://gerrit.cloudera.org:8080/#/c/20498/5/fe/src/main/java/org/apache/impala/planner/ScanNode.java@456
PS5, Line 456:    * join node id to list of runtime filters from it.
> Could you describe the algorithm also here?
Added comments.


http://gerrit.cloudera.org:8080/#/c/20498/5/fe/src/main/java/org/apache/impala/planner/ScanNode.java@470
PS5, Line 470:         // Row-level runtime filtering, however, only applies at 
columnar file format.
> Could extract this loop into a function.
Move it to groupFiltersForCardinalityReduction().


http://gerrit.cloudera.org:8080/#/c/20498/5/fe/src/main/java/org/apache/impala/planner/ScanNode.java@474
PS5, Line 474:
> Nit: applies
Done


http://gerrit.cloudera.org:8080/#/c/20498/5/fe/src/main/java/org/apache/impala/planner/ScanNode.java@475
PS5, Line 475:
> Nit: applies (present tense) would be better.
Done


http://gerrit.cloudera.org:8080/#/c/20498/5/fe/src/main/java/org/apache/impala/planner/ScanNode.java@487
PS5, Line 487:    */
> Could extract this loop into a function.
Move it to getReducedCardinalityByFilter().


http://gerrit.cloudera.org:8080/#/c/20498/5/fe/src/main/java/org/apache/impala/planner/ScanNode.java@510
PS5, Line 510: ter.get
> Nit: apply
Done


http://gerrit.cloudera.org:8080/#/c/20498/5/fe/src/main/java/org/apache/impala/planner/ScanNode.java@517
PS5, Line 517:         cardOnThisJoin = Math.min(cardOnThisJoin, 
estCardAfterFilter);
> If 'colName' isn't updated here for more than one column, those columns wil
Yes, that is the intention. More distinct column will lead to lower 
parititonSelectivity. If columns can not be distinguished, pick the least 
selectivity one among them.


http://gerrit.cloudera.org:8080/#/c/20498/5/fe/src/main/java/org/apache/impala/planner/ScanNode.java@528
PS5, Line 528:             colName = targetSlot.getDesc().getColumn().getName();
> I don't understand why we choose the max here, we've been reducing 'reduced
The algorithm relies on the original cardinality estimate and assume scan will 
return at least the same number of rows as the highest join node. Updated the 
code and comment to highlight this.



--
To view, visit http://gerrit.cloudera.org:8080/20498
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I033789c9b63a8188484e3afde8e646563918b3e1
Gerrit-Change-Number: 20498
Gerrit-PatchSet: 6
Gerrit-Owner: Riza Suminto <[email protected]>
Gerrit-Reviewer: Abhishek Rawat <[email protected]>
Gerrit-Reviewer: Aman Sinha <[email protected]>
Gerrit-Reviewer: Csaba Ringhofer <[email protected]>
Gerrit-Reviewer: Daniel Becker <[email protected]>
Gerrit-Reviewer: David Rorke <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Riza Suminto <[email protected]>
Gerrit-Comment-Date: Thu, 09 Nov 2023 18:39:48 +0000
Gerrit-HasComments: Yes

Reply via email to