Michael Smith has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21860 )

Change subject: IMPALA-13405: Do tuple analysis to lower AggregationNode 
cardinality
......................................................................


Patch Set 2:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/21860/2/fe/src/main/java/org/apache/impala/planner/AggregationNode.java
File fe/src/main/java/org/apache/impala/planner/AggregationNode.java:

http://gerrit.cloudera.org:8080/#/c/21860/2/fe/src/main/java/org/apache/impala/planner/AggregationNode.java@333
PS2, Line 333:     Set<TupleId> tupleIdsToFind = new HashSet<>(tupleIdToExprs.keySet());
You could avoid copying the keySet by iterating with an explicit Iterator and
removing entries through it, essentially
https://stackoverflow.com/a/1884916.

You could also reimplement with streams, but it gets a little convoluted.
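
For illustration, the iterator approach might look roughly like the sketch
below. This is not the patch's code: Integer stands in for TupleId, String
stands in for Expr, and the "remove entries with exactly one expression" rule
is only an assumption based on the size() == 1 check quoted at line 335.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

class IteratorRemoveSketch {
  // Removes every entry whose expression list has exactly one element and
  // returns the removed keys. No copy of the key set is made.
  static List<Integer> removeSingletons(Map<Integer, List<String>> tupleIdToExprs) {
    List<Integer> removed = new ArrayList<>();
    Iterator<Map.Entry<Integer, List<String>>> iter =
        tupleIdToExprs.entrySet().iterator();
    while (iter.hasNext()) {
      Map.Entry<Integer, List<String>> entry = iter.next();
      if (entry.getValue().size() == 1) {
        removed.add(entry.getKey());
        // Iterator.remove() is the safe way to delete during iteration;
        // calling map.remove() here would risk ConcurrentModificationException.
        iter.remove();
      }
    }
    return removed;
  }
}
```

Iterating the entry set also hands back the value directly, so no separate
get() call is needed inside the loop.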


http://gerrit.cloudera.org:8080/#/c/21860/2/fe/src/main/java/org/apache/impala/planner/AggregationNode.java@335
PS2, Line 335:       if (tupleIdToExprs.get(tupleId).size() == 1) {
nit: If you don't go with one of the options above, I'd capture the result of
get(tupleId) in a local variable to avoid hashing and looking up the key twice.


http://gerrit.cloudera.org:8080/#/c/21860/2/fe/src/main/java/org/apache/impala/planner/AggregationNode.java@347
PS2, Line 347:       // Visit all children nodes in post-order traversal to ensure that we inspect
Can you expand a bit on why this makes sense for selecting the right 
cardinality?


http://gerrit.cloudera.org:8080/#/c/21860/2/fe/src/main/java/org/apache/impala/planner/AggregationNode.java@349
PS2, Line 349:       tupleIdsToFind = new HashSet<>(tupleIdToExprs.keySet());
You could update this to use iterators and iter.remove() as well. Since
iter.remove() updates the underlying HashMap, tupleIdsToFind.isEmpty() and
tupleIdToExprs.isEmpty() should always match.
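
A small sketch of why the two isEmpty() checks stay in sync, assuming
tupleIdsToFind becomes the live keySet() view rather than a copy. Integer and
String are stand-ins for TupleId and Expr, and the "found" set is a
hypothetical placeholder for whatever the traversal decides to drop.

```java
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.Set;

class KeySetViewSketch {
  // Removes every key contained in 'found' by iterating the live key-set view.
  // Because keySet() is backed by the map, each iter.remove() also deletes
  // the corresponding map entry.
  static void removeFound(Map<Integer, List<String>> tupleIdToExprs,
      Set<Integer> found) {
    Set<Integer> tupleIdsToFind = tupleIdToExprs.keySet();  // a view, not a copy
    Iterator<Integer> iter = tupleIdsToFind.iterator();
    while (iter.hasNext()) {
      if (found.contains(iter.next())) {
        iter.remove();
      }
    }
  }
}
```

After the loop, tupleIdToExprs.keySet().isEmpty() and tupleIdToExprs.isEmpty()
report the same thing, so a separate copy is not needed just to track what
remains.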


http://gerrit.cloudera.org:8080/#/c/21860/2/fe/src/main/java/org/apache/impala/planner/AggregationNode.java@375
PS2, Line 375:           estimateNumGroups(tupleIdToExprs.get(tupleId), aggInputCardinality));
Since you're going to call tupleIdToExprs.get() anyway, you might as well
iterate over tupleIdToExprs.entrySet().
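
Roughly, the entrySet() version could look like the sketch below. Everything
here is a stand-in: Integer and String replace TupleId and Expr,
estimateNumGroupsStub is a hypothetical placeholder (not Impala's
estimateNumGroups), and collecting results into a map is only an assumption
about how the value gets used.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class EntrySetSketch {
  // Hypothetical placeholder for the real estimate; it only caps the number
  // of expressions at the input cardinality so the example is runnable.
  static long estimateNumGroupsStub(List<String> exprs, long inputCardinality) {
    return Math.min(exprs.size(), inputCardinality);
  }

  // Iterates entries so the key and value come from a single traversal,
  // with no per-key get() lookup.
  static Map<Integer, Long> estimatePerTuple(
      Map<Integer, List<String>> tupleIdToExprs, long aggInputCardinality) {
    Map<Integer, Long> result = new HashMap<>();
    for (Map.Entry<Integer, List<String>> entry : tupleIdToExprs.entrySet()) {
      result.put(entry.getKey(),
          estimateNumGroupsStub(entry.getValue(), aggInputCardinality));
    }
    return result;
  }
}
```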



--
To view, visit http://gerrit.cloudera.org:8080/21860
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Icd589ab5f7ba9566a0d35784f61f5ffaef5696e7
Gerrit-Change-Number: 21860
Gerrit-PatchSet: 2
Gerrit-Owner: Riza Suminto <[email protected]>
Gerrit-Reviewer: Aman Sinha <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Michael Smith <[email protected]>
Gerrit-Reviewer: Riza Suminto <[email protected]>
Gerrit-Reviewer: Yida Wu <[email protected]>
Gerrit-Comment-Date: Tue, 01 Oct 2024 18:49:15 +0000
Gerrit-HasComments: Yes
