[ 
https://issues.apache.org/jira/browse/CALCITE-4465?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Liya Fan updated CALCITE-4465:
------------------------------
    Description: 
According to our current implementation ({{RelMdDistinctRowCount}}), estimating 
the number of distinct values (NDV) does not make good use of the filter 
condition. It simply forwards the call to its input operator with the filter 
condition attached.
In fact, more information can be obtained for some special but commonly used 
conditions. For example, given condition {{x = 'a'}}, we can deduce that {{NDV( 
x ) <= 1}}. Given condition {{x in ('a', 'b')}}, we can deduce that {{NDV( x ) 
<= 2}}.
More generally, if we have {{x in ('a', 'b') AND y in ('c', 'd', 'e')}}, we 
have {{NDV(x, y) <= 2 * 3 = 6}}.
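As a rough illustration of the bound described above, here is a minimal sketch in Python. It is not Calcite code: the function name {{ndv_upper_bound}} and its arguments are hypothetical, and it assumes the filter is a conjunction of equality/IN predicates over literals, with {{x = 'a'}} modeled as a one-element IN list.

```python
from math import prod


def ndv_upper_bound(key_columns, allowed_values):
    """Upper bound on NDV(key_columns) implied by a conjunctive filter.

    key_columns: column names whose combined NDV is being estimated.
    allowed_values: hypothetical map from a column name to the literals
        permitted by an equality or IN predicate on that column.
    Returns None when some key column is unconstrained, because the
    filter alone then gives no bound.
    """
    sizes = []
    for column in key_columns:
        if column not in allowed_values:
            return None
        # Duplicate literals in an IN list do not add distinct values.
        sizes.append(len(set(allowed_values[column])))
    # Each key column contributes at most its number of permitted literals.
    return prod(sizes)


filters = {"x": ["a", "b"], "y": ["c", "d", "e"]}
print(ndv_upper_bound(["x"], {"x": ["a"]}))  # x = 'a'
print(ndv_upper_bound(["x"], filters))       # x in ('a', 'b')
print(ndv_upper_bound(["x", "y"], filters))  # both predicates combined
```

In practice such a bound would presumably be combined with the existing estimate by taking the minimum of the two, since the input may contain fewer distinct values than the filter permits.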

Thoughts?

  was:
According to our current implementation ({{RelMdDistinctRowCount}}), estimating 
the number of distinct values (NDV) does not make good use of the filter 
condition. It simply forwards the call to its input operator with the filter 
condition attached.
In fact, more information can be obtained for some special but commonly used 
conditions. For example, given condition {{x = 'a'}}, we can deduce that 
{{NDV(x) <= 1}}. Given condition {{x in ('a', 'b')}}, we can deduce that 
{{NDV(x) <= 2}}.
More generally, if we have {{x in ('a', 'b') AND y in ('c', 'd', 'e')}}, we 
have {{NDV(x, y) <= 2 * 3 = 6}}.

Thoughts?


> Estimate the number of distinct values by filter condition
> ----------------------------------------------------------
>
>                 Key: CALCITE-4465
>                 URL: https://issues.apache.org/jira/browse/CALCITE-4465
>             Project: Calcite
>          Issue Type: Improvement
>          Components: core
>            Reporter: Liya Fan
>            Assignee: Liya Fan
>            Priority: Major
>
> According to our current implementation ({{RelMdDistinctRowCount}}), 
> estimating the number of distinct values (NDV) does not make good use of 
> the filter condition. It simply forwards the call to its input operator with 
> the filter condition attached.
> In fact, more information can be obtained for some special but commonly used 
> conditions. For example, given condition {{x = 'a'}}, we can deduce that 
> {{NDV( x ) <= 1}}. Given condition {{x in ('a', 'b')}}, we can deduce that 
> {{NDV( x ) <= 2}}.
> More generally, if we have {{x in ('a', 'b') AND y in ('c', 'd', 'e')}}, we 
> have {{NDV(x, y) <= 2 * 3 = 6}}.
> Thoughts?


