Zoltan Borok-Nagy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17387 )

Change subject: IMPALA-10681: Improve join cardinality estimates
......................................................................


Patch Set 3:

(6 comments)

http://gerrit.cloudera.org:8080/#/c/17387/3//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/17387/3//COMMIT_MSG@38
PS3, Line 38: TODO: We would want to run a performance test to validate
            :    the plan changes for TPC-DS at a sufficiently high scale 
factor.
Just out of curiosity I did a single node perf run on TPCDS with scale 10

 bin/single_node_perf_run.py --scale 10 --iterations 10 --table_formats 
parquet/snap --workloads tpcds <git hash>

And looked at the changed queries:

Q4, Q11: no real change

Q5: ~8% improvement
Q54: ~5% improvement
Q71: ~15% improvement
Q74: ~10% improvement

But of course, it'd be more interesting to see the results at higher scale on a 
real cluster.


http://gerrit.cloudera.org:8080/#/c/17387/3/fe/src/main/java/org/apache/impala/analysis/Expr.java
File fe/src/main/java/org/apache/impala/analysis/Expr.java:

http://gerrit.cloudera.org:8080/#/c/17387/3/fe/src/main/java/org/apache/impala/analysis/Expr.java@1859
PS3, Line 1859:   public Expr getSlotDescFirstSourceExpr() {
nit: could you please add a brief comment for this method?


http://gerrit.cloudera.org:8080/#/c/17387/3/fe/src/main/java/org/apache/impala/planner/JoinNode.java
File fe/src/main/java/org/apache/impala/planner/JoinNode.java:

http://gerrit.cloudera.org:8080/#/c/17387/3/fe/src/main/java/org/apache/impala/planner/JoinNode.java@306
PS3, Line 306: getOtherJoinCardinality
nit: could you mention this calculation in the doc comment of this method?


http://gerrit.cloudera.org:8080/#/c/17387/3/fe/src/main/java/org/apache/impala/planner/JoinNode.java@447
PS3, Line 447: after IMPALA-7310 is fixed
I think this part is fixed by https://gerrit.cloudera.org/#/c/16349/


http://gerrit.cloudera.org:8080/#/c/17387/3/fe/src/main/java/org/apache/impala/planner/JoinNode.java@507
PS3, Line 507: return new EqJoinConjunctScanSlots(eqJoinConjunct, lhsScanSlot, 
rhsScanSlot);
             :       } else {
nit: Since we return in the 'true' branch, we don't need to nest the statements 
in an 'else' block. I.e. to reduce nesting, we can just have:

  if (hasLhs && hasRhs) {
    return new EqJoinConjunctScanSlots(eqJoinConjunct, lhsScanSlot, 
rhsScanSlot);
  }
  Expr lhsExpr = eqJoinConjunct.getChild(0);
  ...


http://gerrit.cloudera.org:8080/#/c/17387/3/fe/src/main/java/org/apache/impala/planner/JoinNode.java@591
PS3, Line 591:   public static final class NdvAndRowCountStats {
nit: Could you please add a brief comment?



--
To view, visit http://gerrit.cloudera.org:8080/17387
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8aa9d3b8f3c4848b3e9414fe19ad7ad348d12ecc
Gerrit-Change-Number: 17387
Gerrit-PatchSet: 3
Gerrit-Owner: Aman Sinha <[email protected]>
Gerrit-Reviewer: Aman Sinha <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Qifan Chen <[email protected]>
Gerrit-Reviewer: Zoltan Borok-Nagy <[email protected]>
Gerrit-Comment-Date: Wed, 19 May 2021 15:29:11 +0000
Gerrit-HasComments: Yes

Reply via email to