Michael Smith has posted comments on this change. ( http://gerrit.cloudera.org:8080/21860 )

Change subject: IMPALA-13405: Do tuple analysis to lower AggregationNode cardinality
......................................................................


Patch Set 2:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/21860/2/fe/src/main/java/org/apache/impala/planner/AggregationNode.java
File fe/src/main/java/org/apache/impala/planner/AggregationNode.java:

http://gerrit.cloudera.org:8080/#/c/21860/2/fe/src/main/java/org/apache/impala/planner/AggregationNode.java@333
PS2, Line 333:     Set<TupleId> tupleIdsToFind = new HashSet<>(tupleIdToExprs.keySet());
You could avoid copying the keySet using iterators, essentially https://stackoverflow.com/a/1884916. You could also reimplement with streams, but it gets a little convoluted.


http://gerrit.cloudera.org:8080/#/c/21860/2/fe/src/main/java/org/apache/impala/planner/AggregationNode.java@335
PS2, Line 335:       if (tupleIdToExprs.get(tupleId).size() == 1) {
nit: If you don't go with an option above, I'd capture the result of get(tupleId) to avoid computing the hashcode twice.


http://gerrit.cloudera.org:8080/#/c/21860/2/fe/src/main/java/org/apache/impala/planner/AggregationNode.java@347
PS2, Line 347:     // Visit all children nodes in post-order traversal to ensure that we inspect
Can you expand a bit on why this makes sense for selecting the right cardinality?


http://gerrit.cloudera.org:8080/#/c/21860/2/fe/src/main/java/org/apache/impala/planner/AggregationNode.java@349
PS2, Line 349:     tupleIdsToFind = new HashSet<>(tupleIdToExprs.keySet());
You could update this to use iterators and iter.remove as well. Since it updates the underlying HashMap, tupleIdsToFind.isEmpty() and tupleIdToExprs.isEmpty() should always match.


http://gerrit.cloudera.org:8080/#/c/21860/2/fe/src/main/java/org/apache/impala/planner/AggregationNode.java@375
PS2, Line 375:           estimateNumGroups(tupleIdToExprs.get(tupleId), aggInputCardinality));
Since you're going to call tupleIdToExprs.get() anyway, might as well iterate over tupleIdToExprs.entrySet().

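
For illustration, here is a minimal, self-contained sketch of the iterator approach suggested in the comments on lines 333, 349, and 375. It is not the patch's code: Integer and String stand in for TupleId and Expr, the class and method names are hypothetical, and what the real code does inside the size() == 1 branch is assumed to be "record the key and drop the entry". It iterates over entrySet() so the value is read without a second get(), and calls Iterator.remove() so the backing map shrinks without copying keySet().

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

public class IteratorRemovalSketch {
  // Removes every entry whose expression list has exactly one element and
  // returns the removed keys. Integer/String are stand-ins for TupleId/Expr
  // (assumption for this sketch only).
  public static List<Integer> removeSingletons(Map<Integer, List<String>> tupleIdToExprs) {
    List<Integer> removed = new ArrayList<>();
    Iterator<Map.Entry<Integer, List<String>>> iter =
        tupleIdToExprs.entrySet().iterator();
    while (iter.hasNext()) {
      Map.Entry<Integer, List<String>> entry = iter.next();
      // getValue() reuses the entry already in hand; no extra hash lookup.
      if (entry.getValue().size() == 1) {
        removed.add(entry.getKey());
        // Removes the entry from the underlying map; safe during iteration.
        iter.remove();
      }
    }
    return removed;
  }

  public static void main(String[] args) {
    Map<Integer, List<String>> tupleIdToExprs = new HashMap<>();
    tupleIdToExprs.put(0, Arrays.asList("a"));
    tupleIdToExprs.put(1, Arrays.asList("b", "c"));
    List<Integer> removed = removeSingletons(tupleIdToExprs);
    System.out.println("removed=" + removed + " remaining=" + tupleIdToExprs.keySet());
  }
}
```

Because the map itself is what gets consumed, a check such as tupleIdToExprs.isEmpty() can replace a separate "ids still to find" set, provided the caller no longer needs the original map contents afterwards.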
-- 
To view, visit http://gerrit.cloudera.org:8080/21860
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Icd589ab5f7ba9566a0d35784f61f5ffaef5696e7
Gerrit-Change-Number: 21860
Gerrit-PatchSet: 2
Gerrit-Owner: Riza Suminto <[email protected]>
Gerrit-Reviewer: Aman Sinha <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Michael Smith <[email protected]>
Gerrit-Reviewer: Riza Suminto <[email protected]>
Gerrit-Reviewer: Yida Wu <[email protected]>
Gerrit-Comment-Date: Tue, 01 Oct 2024 18:49:15 +0000
Gerrit-HasComments: Yes
