liuyao has posted comments on this change. ( http://gerrit.cloudera.org:8080/17637 )
Change subject: IMPALA-10766: Better selectivity for =,not distinct
......................................................................


Patch Set 7:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/17637/6/fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java
File fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java:

http://gerrit.cloudera.org:8080/#/c/17637/6/fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java@288
PS6, Line 288:         selectivity_ = selectivity_ * (double) (numRows - numNulls) / numRows +
:             numNulls / numRows;
> nit. In this case, can we use selectivity = #nulls / #rows directly?
I don't think #nulls / #rows is enough on its own. The formula may need to be
(1 - 1/ndv) * (numRows - numNulls) / numRows + numNulls / numRows.
Some non-null values also satisfy "is distinct from <non-null value>", so they have to be counted in addition to the nulls. For example, 'aa' is distinct from 'bb'.


-- 
To view, visit http://gerrit.cloudera.org:8080/17637
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib8ec62f2355a7036125cc0d261b790644b9f4b60
Gerrit-Change-Number: 17637
Gerrit-PatchSet: 7
Gerrit-Owner: liuyao <[email protected]>
Gerrit-Reviewer: Aman Sinha <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Qifan Chen <[email protected]>
Gerrit-Reviewer: Zoltan Borok-Nagy <[email protected]>
Gerrit-Reviewer: liuyao <[email protected]>
Gerrit-Comment-Date: Tue, 13 Jul 2021 02:36:56 +0000
Gerrit-HasComments: Yes
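
For reference, the formula suggested in the comment on line 288 could be sketched roughly as below. This is only an illustration, not the code in the patch: the class name `DistinctFromSelectivity`, the method `estimate`, and the use of -1 to signal "unknown" are all assumptions made for this sketch. It assumes the predicate has the form "col IS DISTINCT FROM <non-null literal>" and that values are uniformly distributed over the ndv distinct values.

```java
// Hypothetical helper, not part of Impala: a sketch of the formula
// (1 - 1/ndv) * (numRows - numNulls) / numRows + numNulls / numRows.
final class DistinctFromSelectivity {
    private DistinctFromSelectivity() {}

    // Returns the estimated fraction of rows satisfying
    // "col IS DISTINCT FROM <non-null literal>", or -1 when the
    // statistics are unusable (an assumption of this sketch).
    static double estimate(long numRows, long numNulls, long ndv) {
        if (numRows <= 0 || ndv <= 0 || numNulls < 0 || numNulls > numRows) {
            return -1.0;
        }
        // Cast before dividing so the division is not done on integers.
        double nullFraction = (double) numNulls / numRows;
        double nonNullFraction = 1.0 - nullFraction;
        // A non-null row differs from the literal with probability 1 - 1/ndv.
        double distinctAmongNonNull = 1.0 - 1.0 / ndv;
        // Every NULL row is distinct from a non-null literal.
        return distinctAmongNonNull * nonNullFraction + nullFraction;
    }

    public static void main(String[] args) {
        // 1000 rows, 100 nulls, 10 distinct values.
        System.out.println(estimate(1000, 100, 10));
    }
}
```

With 1000 rows, 100 nulls and ndv = 10, the estimate is about 0.91, whereas #nulls / #rows alone would give 0.1. The sketch also casts to double before dividing; whether the `numNulls / numRows` term in the patch needs the same cast depends on the declared types there, which are not visible in this comment.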
