Kurt Deschler has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19682 )

Change subject: IMPALA-12006: Improve cardinality estimation for joins 
involving multiple conjuncts
......................................................................


Patch Set 6:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/19682/6/fe/src/main/java/org/apache/impala/planner/JoinNode.java
File fe/src/main/java/org/apache/impala/planner/JoinNode.java:

http://gerrit.cloudera.org:8080/#/c/19682/6/fe/src/main/java/org/apache/impala/planner/JoinNode.java@496
PS6, Line 496:       if (corrfactor > 0) cumulative_sel *= (((double) joinCard/lhsCard)/rhsCard);
> On line 500 we are dividing the cumulative selectivity by the corrfactor. T
This would be more readable as (double) joinCard/(lhsCard*rhsCard);

Given that you are taking the min selectivity across the local join conditions 
using multiplication on every iteration, the division by corrfactor should also 
be applied on each iteration rather than once below (on line 500). Otherwise, 
the impact of the coefficient will vary depending on the number of conditions. 
In most cases, there will be more correlation when there are more conditions.
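
As a rough illustration of the difference, here is a minimal sketch. The class 
name, method names, and numeric values are hypothetical and are not taken from 
JoinNode.java; it only contrasts dividing by the correction factor once with 
dividing on every iteration.

```java
// Hypothetical sketch: names and values are illustrative assumptions,
// not the actual JoinNode.java code.
final class CorrFactorSketch {
  // Multiplies the per-conjunct selectivities, then divides by corrFactor once.
  static double applyOnce(double[] sels, double corrFactor) {
    double cumulativeSel = 1.0;
    for (double sel : sels) cumulativeSel *= sel;
    return cumulativeSel / corrFactor;
  }

  // Divides each per-conjunct selectivity by corrFactor before multiplying.
  static double applyPerConjunct(double[] sels, double corrFactor) {
    double cumulativeSel = 1.0;
    for (double sel : sels) cumulativeSel *= sel / corrFactor;
    return cumulativeSel;
  }

  public static void main(String[] args) {
    double corrFactor = 0.5;            // assumed value, for illustration only
    double[] two = {0.1, 0.2};          // two conjuncts
    double[] three = {0.1, 0.2, 0.5};   // three conjuncts
    System.out.println(applyOnce(two, corrFactor));
    System.out.println(applyPerConjunct(two, corrFactor));
    System.out.println(applyOnce(three, corrFactor));
    System.out.println(applyPerConjunct(three, corrFactor));
  }
}
```

In this sketch, applyOnce scales the product of selectivities by 1/corrFactor 
no matter how many conjuncts there are, while applyPerConjunct scales it by 
1/corrFactor^n for n conjuncts, so the adjustment grows with each additional 
condition.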


http://gerrit.cloudera.org:8080/#/c/19682/6/fe/src/main/java/org/apache/impala/planner/JoinNode.java@500
PS6, Line 500:       result = (long) Math.min(result, ((cumulative_sel * 
lhsCard) * rhsCard)/corrfactor);
This would be more readable as (lhsCard * rhsCard) * cumulative_sel / corrfactor

However, I'm not sure why it is appropriate to do the multiplication again here.



--
To view, visit http://gerrit.cloudera.org:8080/19682
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I845d778a58404af834f7501fc8157a5a4b4bcc35
Gerrit-Change-Number: 19682
Gerrit-PatchSet: 6
Gerrit-Owner: Aman Sinha <[email protected]>
Gerrit-Reviewer: Aman Sinha <[email protected]>
Gerrit-Reviewer: Csaba Ringhofer <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Kurt Deschler <[email protected]>
Gerrit-Reviewer: Quanlong Huang <[email protected]>
Gerrit-Comment-Date: Mon, 10 Apr 2023 20:11:28 +0000
Gerrit-HasComments: Yes
