Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/11440 )
Change subject: IMPALA-7351: Improve memory estimates for Kudu Scan Nodes ...................................................................... Patch Set 1: (5 comments) http://gerrit.cloudera.org:8080/#/c/11440/1//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/11440/1//COMMIT_MSG@14 PS1, Line 14: Modified resource requirements planner test. I don't see any changes to this test (it seems like I didn't include a Kudu scan tests case, which is an oversight). I think we should add some tests there. http://gerrit.cloudera.org:8080/#/c/11440/1/fe/src/main/java/org/apache/impala/planner/KuduScanNode.java File fe/src/main/java/org/apache/impala/planner/KuduScanNode.java: http://gerrit.cloudera.org:8080/#/c/11440/1/fe/src/main/java/org/apache/impala/planner/KuduScanNode.java@91 PS1, Line 91: // Factor capturing the worst-case deviation from a uniform distribution of scan ranges Put this in the ScanNode base class? http://gerrit.cloudera.org:8080/#/c/11440/1/fe/src/main/java/org/apache/impala/planner/KuduScanNode.java@284 PS1, Line 284: // The non-MT scan node requires at least one scanner thread. : int maxScannerThreads; : if (queryOptions.getMt_dop() >= 1) { : maxScannerThreads = 1; : } else { : // TODO: what about heterogeneous hardware, i.e., different num of cores? : maxScannerThreads = Math.min(perHostScanRanges, RuntimeEnv.INSTANCE.getNumCores()); : // Account for the max scanner threads query option. : if (queryOptions.isSetNum_scanner_threads() && : queryOptions.getNum_scanner_threads() > 0) { : maxScannerThreads = : Math.min(maxScannerThreads, queryOptions.getNum_scanner_threads()); : } : } I think this is shared with HdfsScanNode. Can we factor out to a helper function in ScanNode? http://gerrit.cloudera.org:8080/#/c/11440/1/fe/src/main/java/org/apache/impala/planner/KuduScanNode.java@298 PS1, Line 298: int num_cols = desc_.getSlots().size(); Can you leave a comment briefly describing the heuristic and mentioning any known limitations? E.g. doesn't model the row batch queue. Otherwise it will be mysterious in a year's time when we've forgotten the context. http://gerrit.cloudera.org:8080/#/c/11440/1/testdata/workloads/functional-planner/queries/PlannerTest/kudu-selectivity.test File testdata/workloads/functional-planner/queries/PlannerTest/kudu-selectivity.test: PS1: You might know already, but we don't validate resource requirements for this test because PlannerTestOption.VALIDATE_RESOURCES isn't set for it. I don't think we need to validate resources here, but didn't want us to think we had coverage that we don't. -- To view, visit http://gerrit.cloudera.org:8080/11440 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If9bb52530271b0bff91311a67d222a2e9fac1229 Gerrit-Change-Number: 11440 Gerrit-PatchSet: 1 Gerrit-Owner: Bikramjeet Vig <[email protected]> Gerrit-Reviewer: Tim Armstrong <[email protected]> Gerrit-Comment-Date: Mon, 17 Sep 2018 22:01:55 +0000 Gerrit-HasComments: Yes
