Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/11440 )
Change subject: IMPALA-7351: Improve memory estimates for Kudu Scan Nodes
......................................................................

Patch Set 2:

(6 comments)

Just a few more cleanup and test suggestions.

http://gerrit.cloudera.org:8080/#/c/11440/2/fe/src/main/java/org/apache/impala/planner/KuduScanNode.java
File fe/src/main/java/org/apache/impala/planner/KuduScanNode.java:

http://gerrit.cloudera.org:8080/#/c/11440/2/fe/src/main/java/org/apache/impala/planner/KuduScanNode.java@284
PS2, Line 284: int perHostScanRanges = (int) Math.ceil(((double) scanRangeSpecs_
Can we factor out the per-host scan range calculation into a helper? I know it's short, but the logic isn't trivial and I think we expect it to be identical between the scan node implementations (unless something changes in the future).

http://gerrit.cloudera.org:8080/#/c/11440/2/fe/src/main/java/org/apache/impala/planner/KuduScanNode.java@296
PS2, Line 296: .setMemEstimateBytes(mem_estimate_per_thread * maxScannerThreads)
This new indentation seems weird to me.

http://gerrit.cloudera.org:8080/#/c/11440/2/fe/src/main/java/org/apache/impala/planner/ScanNode.java
File fe/src/main/java/org/apache/impala/planner/ScanNode.java:

http://gerrit.cloudera.org:8080/#/c/11440/2/fe/src/main/java/org/apache/impala/planner/ScanNode.java@232
PS2, Line 232: ComputeMaxNumberOfScannerThreads
The initial letter of a method name should be lower-case in Java.

http://gerrit.cloudera.org:8080/#/c/11440/2/fe/src/main/java/org/apache/impala/planner/ScanNode.java@237
PS2, Line 237: maxScannerThreads = 1;
We can just return early here (and eliminate the nesting of the else branch).
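To make the helper and early-return suggestions concrete, here is a rough sketch of the shape I have in mind. It is not the patch's code: the class name, the parameter names, the `singleThreaded` flag, the divisor in the ceiling division, and the min/max clamping are all assumptions for illustration, since the quoted lines are truncated.

```java
// Hypothetical sketch only; names and parameters are not taken from the patch.
final class ScanThreadSketch {
  private ScanThreadSketch() {}

  // Shared helper: scan ranges assigned to one host, rounded up.
  // Assumes ranges are spread evenly over numNodes (an assumption).
  static int perHostScanRanges(long totalScanRanges, int numNodes) {
    if (numNodes <= 0) {
      throw new IllegalArgumentException("numNodes must be positive");
    }
    return (int) Math.ceil((double) totalScanRanges / numNodes);
  }

  // Lower-case initial letter, and an early return instead of an else branch.
  // singleThreaded stands in for whatever condition forces one thread.
  static int computeMaxNumberOfScannerThreads(
      boolean singleThreaded, int maxThreadsPerHost, int perHostScanRanges) {
    if (singleThreaded) return 1;
    // Assumed clamp: no more threads than ranges or the per-host limit,
    // and never fewer than one.
    return Math.max(1, Math.min(perHostScanRanges, maxThreadsPerHost));
  }
}
```

With a helper like this, each scan node implementation could call the same function rather than repeating the cast-and-ceil expression inline.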

http://gerrit.cloudera.org:8080/#/c/11440/1/testdata/workloads/functional-planner/queries/PlannerTest/kudu-selectivity.test
File testdata/workloads/functional-planner/queries/PlannerTest/kudu-selectivity.test:

PS1:
> Yup, my main intention was to just keep track of how the estimates change o
Yeah, that seems fine to update then, no reason to have things diverge :)

http://gerrit.cloudera.org:8080/#/c/11440/2/testdata/workloads/functional-planner/queries/PlannerTest/resource-requirements.test
File testdata/workloads/functional-planner/queries/PlannerTest/resource-requirements.test:

http://gerrit.cloudera.org:8080/#/c/11440/2/testdata/workloads/functional-planner/queries/PlannerTest/resource-requirements.test@5256
PS2, Line 5256: select * from functional_kudu.alltypes
Can you also add a scan of a single column, a scan of count(*), and a scan of an unpartitioned table like tpch_kudu.nation? I think the last one should only be scanned on one node with one thread.

--
To view, visit http://gerrit.cloudera.org:8080/11440
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If9bb52530271b0bff91311a67d222a2e9fac1229
Gerrit-Change-Number: 11440
Gerrit-PatchSet: 2
Gerrit-Owner: Bikramjeet Vig <[email protected]>
Gerrit-Reviewer: Bikramjeet Vig <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Tim Armstrong <[email protected]>
Gerrit-Comment-Date: Mon, 24 Sep 2018 16:14:03 +0000
Gerrit-HasComments: Yes
