Zoltan Borok-Nagy has posted comments on this change. ( http://gerrit.cloudera.org:8080/16842 )
Change subject: IMPALA-10377: Improve the accuracy of resource estimation PlanNode does not consider some factors when estimating memory, this will cause a large error rate ...................................................................... Patch Set 16: (8 comments) Thanks for your contribution. The change lgtm, I've mostly found style issues. http://gerrit.cloudera.org:8080/#/c/16842/16//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/16842/16//COMMIT_MSG@8 PS16, Line 8: PlanNode does not consider some factors when estimating memory, nit: Please put a blank line after the title http://gerrit.cloudera.org:8080/#/c/16842/16//COMMIT_MSG@23 PS16, Line 23: Datas nit: just 'data', or 'data rows'? http://gerrit.cloudera.org:8080/#/c/16842/16/fe/src/main/java/org/apache/impala/planner/HashJoinNode.java File fe/src/main/java/org/apache/impala/planner/HashJoinNode.java: http://gerrit.cloudera.org:8080/#/c/16842/16/fe/src/main/java/org/apache/impala/planner/HashJoinNode.java@220 PS16, Line 220: computeJoinResourceProfile Could you please add a comment to describe the overal calculation? http://gerrit.cloudera.org:8080/#/c/16842/16/fe/src/main/java/org/apache/impala/planner/HashJoinNode.java@230 PS16, Line 230: long rhsCard = getChild(1).getCardinality(); : long rhsNdv = 1; : for (Expr eqJoinPredicate: eqJoinConjuncts_) { : long rhsPdNdv = getNdv(eqJoinPredicate.getChild(1)); : rhsPdNdv = Math.min(rhsPdNdv, rhsCard); : if (rhsPdNdv != -1) { : rhsNdv = PlanNode.checkedMultiply(rhsNdv, rhsPdNdv); : } : } Could you please add some comments for the computation of rhsNdv? http://gerrit.cloudera.org:8080/#/c/16842/16/fe/src/test/java/org/apache/impala/planner/CardinalityTest.java File fe/src/test/java/org/apache/impala/planner/CardinalityTest.java: http://gerrit.cloudera.org:8080/#/c/16842/16/fe/src/test/java/org/apache/impala/planner/CardinalityTest.java@838 PS16, Line 838: 10L * 1024L * 1024L nit: probably we should just use MIN_HASH_TBL_MEM here http://gerrit.cloudera.org:8080/#/c/16842/16/fe/src/test/java/org/apache/impala/planner/CardinalityTest.java@921 PS16, Line 921: : : : : nit: too many blank lines http://gerrit.cloudera.org:8080/#/c/16842/16/fe/src/test/java/org/apache/impala/planner/CardinalityTest.java@1101 PS16, Line 1101: /* nit: doc comments start with /** http://gerrit.cloudera.org:8080/#/c/16842/16/fe/src/test/java/org/apache/impala/planner/CardinalityTest.java@1143 PS16, Line 1143: isDistributedPlan Seems like all tests use distributed plans. It'd be good to add a few tests for non distributed plans. -- To view, visit http://gerrit.cloudera.org:8080/16842 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic01db168ff2c6d6de33ee553a8175599f035d7a1 Gerrit-Change-Number: 16842 Gerrit-PatchSet: 16 Gerrit-Owner: liuyao <[email protected]> Gerrit-Reviewer: Aman Sinha <[email protected]> Gerrit-Reviewer: Bikramjeet Vig <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Tim Armstrong <[email protected]> Gerrit-Reviewer: Zoltan Borok-Nagy <[email protected]> Gerrit-Comment-Date: Tue, 23 Feb 2021 16:54:33 +0000 Gerrit-HasComments: Yes
