Marcel Kornacker has posted comments on this change. Change subject: IMPALA-3748: estimate minimum buffers in planner ......................................................................
Patch Set 8: (15 comments) http://gerrit.cloudera.org:8080/#/c/5847/8/fe/src/main/java/org/apache/impala/planner/PipelinedPlanNodeSet.java File fe/src/main/java/org/apache/impala/planner/PipelinedPlanNodeSet.java: Line 48: public class ResourceRequirement { these aren't requirements if they include the existing "estimates". ResourceProfile? it feels like this should be a standalone class, so it can be attached to plan nodes. Line 115: node.computeCosts(queryOptions); they're not costs Line 156: hdfsScanMemEstimate + hbaseScanMemEstimate + nonScanMemEstimate + dataSinkMemEstimate); long line http://gerrit.cloudera.org:8080/#/c/5847/8/fe/src/main/java/org/apache/impala/planner/PlanFragment.java File fe/src/main/java/org/apache/impala/planner/PlanFragment.java: Line 209: public int getNumInstancesPerHost(TQueryOptions queryOptions) { passing in the dop directly is preferable, because it's easier to see what's going on and it's narrower. Line 232: public long getNumDistinctValues(TQueryOptions queryOptions, List<Expr> exprs) { rename to include "per instance" in the name to clarify. you might also want to abbreviate NumDistinctValues as Ndv to keep this manageable. http://gerrit.cloudera.org:8080/#/c/5847/8/fe/src/main/java/org/apache/impala/planner/PlanNode.java File fe/src/main/java/org/apache/impala/planner/PlanNode.java: Line 120: // estimated per-instance memory usage for this plan node in bytes; the "estimates" don't really have defined semantics (max? desirable?). we might want to deprecate this, and mark it as such, to decrease the amount of confusion generated by having multiple values that look vaguely related. also, "usage" still doesn't mean anything. i'm okay with simply marking this "obsolete". Line 124: // Minimum number of buffers required to execute spilling operator. why restrict this to spilling? doesn't every operator have a defined requirement? Line 281: TQueryOptions queryOptions, TExplainLevel detailLevel) { what's the role of the query options? Line 317: PrintUtils.printInstances(" ", fragment_.getNumInstances(queryOptions))); why do we want this? Line 322: PrintUtils.printPerInstanceMinBuffers(" ", perInstanceMinBufferBytes_)); why make this conditional? http://gerrit.cloudera.org:8080/#/c/5847/8/testdata/workloads/functional-planner/queries/PlannerTest/constant-folding.test File testdata/workloads/functional-planner/queries/PlannerTest/constant-folding.test: Line 10: | hosts=1 instances=1 est-per-instance-mem=0B how come the number of hosts changed? http://gerrit.cloudera.org:8080/#/c/5847/5/testdata/workloads/functional-planner/queries/PlannerTest/kudu-selectivity.test File testdata/workloads/functional-planner/queries/PlannerTest/kudu-selectivity.test: Line 15: hosts=1 instances=1 est-per-instance-mem=0B regarding the number of fragment instances: that's really a per-fragment (rather than per execnode) concept, so probably less noisy to print this at the fragment level. http://gerrit.cloudera.org:8080/#/c/5847/8/testdata/workloads/functional-query/queries/QueryTest/explain-level0.test File testdata/workloads/functional-query/queries/QueryTest/explain-level0.test: Line 3 why change this? Line 8: 'Per-Host Resource Estimates: Memory=388.41MB' how come no resource requirements? http://gerrit.cloudera.org:8080/#/c/5847/8/testdata/workloads/functional-query/queries/QueryTest/explain-level3.test File testdata/workloads/functional-query/queries/QueryTest/explain-level3.test: Line 8: 'Per-Host Resource Estimates: Memory=388.41MB' how come this is missing the requirements? -- To view, visit http://gerrit.cloudera.org:8080/5847 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I1e358182bcf2bc5fe5c73883eb97878735b12d37 Gerrit-PatchSet: 8 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong <[email protected]> Gerrit-Reviewer: Marcel Kornacker <[email protected]> Gerrit-Reviewer: Matthew Jacobs <[email protected]> Gerrit-Reviewer: Tim Armstrong <[email protected]> Gerrit-HasComments: Yes
