Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16864 )

Change subject: IMPALA-10287: Include parallelism in cost comparison of 
broadcast vs partition
......................................................................


Patch Set 1:

(2 comments)

Code change makes sense and seems fine. I have some questions about whether 
this is going far enough, but it sounds like you maybe have plans to improve 
further - is there a follow-on JIRA for that work?

http://gerrit.cloudera.org:8080/#/c/16864/1/common/thrift/ImpalaInternalService.thrift
File common/thrift/ImpalaInternalService.thrift:

http://gerrit.cloudera.org:8080/#/c/16864/1/common/thrift/ImpalaInternalService.thrift@473
PS1, Line 473:   118: optional bool use_dop_for_costing = false;
I'd consider enabling it by default, it seems like a fairly clear win 
particularly given the log() function making the estimate of the benefit fairly 
conservative.


http://gerrit.cloudera.org:8080/#/c/16864/1/fe/src/main/java/org/apache/impala/planner/DistributedPlanner.java
File fe/src/main/java/org/apache/impala/planner/DistributedPlanner.java:

http://gerrit.cloudera.org:8080/#/c/16864/1/fe/src/main/java/org/apache/impala/planner/DistributedPlanner.java@515
PS1, Line 515: log1p
I wonder if natural log reduces the influence of DOP too much, e.g. log(16) = 
2.7, log(64) = 4.2 and presumably the speedup in hash table build time would be 
more than that. sqrt is also pretty conservative but seems to produce numbers 
that would be a bit closer to what we'd expect sqrt(16) = 4, sqrt(64) = 8.

Not a blocker for sure.

Edit: oh I see you're planning to investigate more.



--
To view, visit http://gerrit.cloudera.org:8080/16864
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Idff569299e5c78720ca17c616a531adac78208e1
Gerrit-Change-Number: 16864
Gerrit-PatchSet: 1
Gerrit-Owner: Aman Sinha <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Tim Armstrong <[email protected]>
Gerrit-Comment-Date: Mon, 14 Dec 2020 20:43:35 +0000
Gerrit-HasComments: Yes

Reply via email to