Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/16864 )
Change subject: IMPALA-10287: Include parallelism in cost comparison of broadcast vs partition ...................................................................... Patch Set 1: (2 comments) Code change makes sense and seems fine. I have some questions about whether this is going far enough, but it sounds like you maybe have plans to improve further - is there a follow-on JIRA for that work? http://gerrit.cloudera.org:8080/#/c/16864/1/common/thrift/ImpalaInternalService.thrift File common/thrift/ImpalaInternalService.thrift: http://gerrit.cloudera.org:8080/#/c/16864/1/common/thrift/ImpalaInternalService.thrift@473 PS1, Line 473: 118: optional bool use_dop_for_costing = false; I'd consider enabling it by default, it seems like a fairly clear win particularly given the log() function making the estimate of the benefit fairly conservative. http://gerrit.cloudera.org:8080/#/c/16864/1/fe/src/main/java/org/apache/impala/planner/DistributedPlanner.java File fe/src/main/java/org/apache/impala/planner/DistributedPlanner.java: http://gerrit.cloudera.org:8080/#/c/16864/1/fe/src/main/java/org/apache/impala/planner/DistributedPlanner.java@515 PS1, Line 515: log1p I wonder if natural log reduces the influence of DOP too much, e.g. log(16) = 2.7, log(64) = 4.2 and presumably the speedup in hash table build time would be more than that. sqrt is also pretty conservative but seems to produce numbers that would be a bit closer to what we'd expect sqrt(16) = 4, sqrt(64) = 8. Not a blocker for sure. Edit: oh I see you're planning to investigate more. -- To view, visit http://gerrit.cloudera.org:8080/16864 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Idff569299e5c78720ca17c616a531adac78208e1 Gerrit-Change-Number: 16864 Gerrit-PatchSet: 1 Gerrit-Owner: Aman Sinha <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Tim Armstrong <[email protected]> Gerrit-Comment-Date: Mon, 14 Dec 2020 20:43:35 +0000 Gerrit-HasComments: Yes
