Csaba Ringhofer has uploaded this change for review. ( http://gerrit.cloudera.org:8080/18543 )

Change subject: WIP: IMPALA-11301: Fix = and != selectivity for very low NDVs
......................................................................

WIP: IMPALA-11301: Fix = and != selectivity for very low NDVs

The original selectivities of 1.0/ndv and 1.0 - 1.0/ndv make sense
for large NDVs, but they evaluate to 1.0 and 0.0 when ndv==1,
which can lead to extremely wrong cardinality estimates.

Adding 1 to the ndv in the formulas also accounts for the possibility
of 0 matching elements in the dataset and helps avoid extreme
selectivities.

IMPALA-7601 contains some discussion about these formulas.

The change is in WIP state because I have not had time to run the
tests yet.

Change-Id: I6b5334a8d7d6ca46a450ff98ae03e5269faaa3c6
---
M fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java
1 file changed, 2 insertions(+), 2 deletions(-)

  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/43/18543/1
--
To view, visit http://gerrit.cloudera.org:8080/18543
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I6b5334a8d7d6ca46a450ff98ae03e5269faaa3c6
Gerrit-Change-Number: 18543
Gerrit-PatchSet: 1
Gerrit-Owner: Csaba Ringhofer <[email protected]>
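
For illustration, the selectivity formulas named in the commit message can be compared side by side. The following is a minimal sketch, not the code from BinaryPredicate.java: the class name SelectivitySketch and its method names are hypothetical, and it assumes that "adding 1 to the ndv" means replacing ndv with ndv + 1 in both formulas, with ndv >= 1.

```java
// Illustrative sketch only; this is NOT the patch to BinaryPredicate.java.
// Class and method names are hypothetical. Assumes ndv >= 1.
final class SelectivitySketch {
    // Original estimate for "col = const": 1.0 / ndv.
    static double eqOld(long ndv) { return 1.0 / ndv; }

    // Original estimate for "col != const": 1.0 - 1.0 / ndv.
    static double neOld(long ndv) { return 1.0 - 1.0 / ndv; }

    // Adjusted estimate for "=", assuming ndv is replaced by ndv + 1.
    static double eqNew(long ndv) { return 1.0 / (ndv + 1); }

    // Adjusted estimate for "!=", assuming ndv is replaced by ndv + 1.
    static double neNew(long ndv) { return 1.0 - 1.0 / (ndv + 1); }

    public static void main(String[] args) {
        // Print old and adjusted estimates for a few NDV values.
        for (long ndv : new long[] {1, 2, 10, 1000}) {
            System.out.printf(
                "ndv=%d  '=': old %.4f, new %.4f  '!=': old %.4f, new %.4f%n",
                ndv, eqOld(ndv), eqNew(ndv), neOld(ndv), neNew(ndv));
        }
    }
}
```

Under these assumptions, ndv == 1 gives 1.0 and 0.0 with the original formulas and 0.5 and 0.5 with the adjusted ones, while for ndv == 1000 the two versions differ by roughly 1e-6, so large-NDV estimates are essentially unchanged.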
