[ 
https://issues.apache.org/jira/browse/IMPALA-11301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17577018#comment-17577018
 ] 

ASF subversion and git services commented on IMPALA-11301:
----------------------------------------------------------

Commit e5164c89e57de817dede5beca7100fd8fea97565 in impala's branch 
refs/heads/branch-4.1.1 from Csaba Ringhofer
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=e5164c89e ]

IMPALA-11301: Fix extreme != selectivity for NDV=1

The original selectivity of 1.0 - 1.0/ndv makes sense for
large NDVs, but the result is 0.0 when ndv==1, which
leads to a cardinality estimate of 1 even for huge tables.
The new selectivity for ndv==1 is 0.5.
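
To make the arithmetic concrete, here is a minimal sketch of the old and new
!= selectivity. It is not Impala's planner code: the function names are
hypothetical, and the cardinality floor of 1 is an assumption inferred from the
"cardinality of 1" described above.

```python
def ne_selectivity_old(ndv: int) -> float:
    """Original formula for col != const: 1.0 - 1.0/ndv."""
    return 1.0 - 1.0 / ndv


def ne_selectivity_new(ndv: int) -> float:
    """Same formula, except that ndv == 1 yields 0.5 instead of 0.0."""
    if ndv == 1:
        return 0.5
    return 1.0 - 1.0 / ndv


def estimate_cardinality(num_rows: int, selectivity: float) -> int:
    """Assumed model: rows * selectivity, never estimated below 1."""
    return max(1, round(num_rows * selectivity))


if __name__ == "__main__":
    rows = 1_000_000
    for ndv in (1, 2, 100):
        old = estimate_cardinality(rows, ne_selectivity_old(ndv))
        new = estimate_cardinality(rows, ne_selectivity_new(ndv))
        print(f"ndv={ndv}: old estimate={old}, new estimate={new}")
```

Under these assumptions only the ndv==1 case changes; for larger NDVs both
variants give the same estimate.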

Note that as the formula for = is not changed (1.0/ndv),
NOT col="const" will still lead to 0.0 selectivity if ndv=1.
Changing the formula of NOT or = would have caused a lot of
subtle changes in plans in tests, so I don't want to touch
those before coming to wider agreement about the correct
approach.
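
The remaining NOT case can be illustrated the same way. This is a sketch under
the assumption that NOT is estimated as 1.0 minus the selectivity of its child
predicate, which is what the 0.0 result above implies; the function names are
hypothetical and do not come from Impala.

```python
def eq_selectivity(ndv: int) -> float:
    """Unchanged formula for col = const: 1.0/ndv."""
    return 1.0 / ndv


def not_selectivity(child_selectivity: float) -> float:
    """Assumed formula for NOT <predicate>: complement of the child."""
    return 1.0 - child_selectivity


if __name__ == "__main__":
    # NOT col = "const" with ndv == 1 still collapses to 0.0.
    print(not_selectivity(eq_selectivity(1)))
```

So with ndv==1 the rewritten form NOT col="const" is still estimated as
matching no rows, even though col != "const" is no longer.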

IMPALA-7601 contains some discussion about these formulas.

Testing:
- added a regression test

Change-Id: I6b5334a8d7d6ca46a450ff98ae03e5269faaa3c6
Reviewed-on: http://gerrit.cloudera.org:8080/18543
Reviewed-by: Impala Public Jenkins <[email protected]>
Tested-by: Impala Public Jenkins <[email protected]>


> Extreme cardinality estimations if NDV=1
> ----------------------------------------
>
>                 Key: IMPALA-11301
>                 URL: https://issues.apache.org/jira/browse/IMPALA-11301
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Frontend
>    Affects Versions: Impala 4.0.0, Impala 4.1.0
>            Reporter: Csaba Ringhofer
>            Assignee: Csaba Ringhofer
>            Priority: Major
>             Fix For: Impala 4.2.0
>
>
> If the NDV of string_col is 1, then string_col != "something const" will have a
> selectivity of 0, leading to a cardinality of 1 regardless of the number of rows.
> This issue was introduced in IMPALA-10677.


