Kurt Deschler has posted comments on this change. ( http://gerrit.cloudera.org:8080/19682 )
Change subject: IMPALA-12006: Improve cardinality estimation for joins involving multiple conjuncts
......................................................................


Patch Set 6:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/19682/6/fe/src/main/java/org/apache/impala/planner/JoinNode.java
File fe/src/main/java/org/apache/impala/planner/JoinNode.java:

http://gerrit.cloudera.org:8080/#/c/19682/6/fe/src/main/java/org/apache/impala/planner/JoinNode.java@496
PS6, Line 496:         if (corrfactor > 0) cumulative_sel *= (((double) joinCard/lhsCard)/rhsCard);
> On line 500 we are dividing the cumulative selectivity by the corrfactor. T
This would be more readable as (double) joinCard/(lhsCard*rhsCard);

Given that you are taking the min selectivity across the local join conditions using multiplication on every iteration, the division by corrfactor should also be applied on each iteration rather than once on line 500. Otherwise, the impact of the coefficient will vary depending on the number of conditions. In most cases, there will be more correlation when there are more conditions.


http://gerrit.cloudera.org:8080/#/c/19682/6/fe/src/main/java/org/apache/impala/planner/JoinNode.java@500
PS6, Line 500:       result = (long) Math.min(result, ((cumulative_sel * lhsCard) * rhsCard)/corrfactor);
This would be more readable as (lhsCard * rhsCard) * cumulative_sel / corrfactor

However, I'm not sure why it is appropriate to do the multiplication by lhsCard and rhsCard again here.
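
To make the two suggestions concrete, here is a minimal standalone sketch. It is not the JoinNode code: the class name, the method names, and the numbers in main are hypothetical, and it only contrasts dividing by corrfactor once after the loop with dividing on each iteration. One caveat on the rewritten expression: in Java the cast in (double) joinCard/(lhsCard*rhsCard) applies only to joinCard, so lhsCard*rhsCard is still evaluated as a long product and can overflow for large cardinalities. The sketch therefore widens both operands to double before multiplying.

```java
public final class CorrFactorSketch {
  // Selectivity of one conjunct: joinCard / (lhsCard * rhsCard).
  // Both cardinalities are widened to double first, so the product
  // cannot overflow the long range.
  public static double pairSelectivity(long joinCard, long lhsCard, long rhsCard) {
    return (double) joinCard / ((double) lhsCard * (double) rhsCard);
  }

  // Multiplies all selectivities, then divides by corrFactor once
  // (the shape of the PS6 code as quoted in the comments above).
  public static double applyOnce(double[] sels, double corrFactor) {
    double cumulative = 1.0;
    for (double s : sels) cumulative *= s;
    return cumulative / corrFactor;
  }

  // Divides by corrFactor on every iteration (the shape suggested
  // in the review comment on line 496).
  public static double applyPerConjunct(double[] sels, double corrFactor) {
    double cumulative = 1.0;
    for (double s : sels) cumulative *= s / corrFactor;
    return cumulative;
  }

  public static void main(String[] args) {
    double corr = 0.5;                 // hypothetical correction factor
    double[] two = {0.1, 0.2};         // hypothetical selectivities
    double[] three = {0.1, 0.2, 0.5};
    System.out.println("2 conjuncts: once=" + applyOnce(two, corr)
        + " perConjunct=" + applyPerConjunct(two, corr));
    System.out.println("3 conjuncts: once=" + applyOnce(three, corr)
        + " perConjunct=" + applyPerConjunct(three, corr));
  }
}
```

With n conjuncts, applyOnce scales the product of selectivities by 1/corrFactor, while applyPerConjunct scales it by (1/corrFactor)^n, so the two only agree when there is a single conjunct.
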
-- 
To view, visit http://gerrit.cloudera.org:8080/19682
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I845d778a58404af834f7501fc8157a5a4b4bcc35
Gerrit-Change-Number: 19682
Gerrit-PatchSet: 6
Gerrit-Owner: Aman Sinha <[email protected]>
Gerrit-Reviewer: Aman Sinha <[email protected]>
Gerrit-Reviewer: Csaba Ringhofer <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Kurt Deschler <[email protected]>
Gerrit-Reviewer: Quanlong Huang <[email protected]>
Gerrit-Comment-Date: Mon, 10 Apr 2023 20:11:28 +0000
Gerrit-HasComments: Yes
