Liya Fan created CALCITE-4465:
---------------------------------
Summary: Estimate the number of distinct values by filter condition
Key: CALCITE-4465
URL: https://issues.apache.org/jira/browse/CALCITE-4465
Project: Calcite
Issue Type: Improvement
Components: core
Reporter: Liya Fan
Assignee: Liya Fan
According to our current implementation ({{RelMdDistinctRowCount}}), estimating
the number of distinct values (NDV) does not make good use of the filter
condition. It simply forwards the call to its input operator with the filter
condition attached.
In fact, more information can be obtained for some special but commonly used
conditions. For example, given condition {{x = 'a'}}, we can deduce that
{{NDV(x) <= 1}}. Given condition {{x in ('a', 'b')}}, we can deduce that
{{NDV(x) <= 2}}.
More generally, if we have {{x in ('a', 'b') AND y in ('c', 'd', 'e')}}, we
have {{NDV(x, y) <= 2 * 3 = 6}}.
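The bound described above could be sketched roughly as follows. This is only an illustration of the arithmetic, not Calcite code: the class {{NdvBoundSketch}}, the method {{upperBound}}, and the map-based representation of the filter are all hypothetical, and a real implementation would have to extract the allowed literals from the filter's {{RexNode}}. The sketch assumes the filter is a conjunction of equality and IN predicates over literals, and that an estimate from the input operator ({{inputNdv}}) is already available.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch of the proposed bound; not part of Calcite. */
final class NdvBoundSketch {
  private NdvBoundSketch() {}

  /**
   * Returns an upper bound on NDV(groupKey) under a conjunctive filter.
   *
   * @param allowedValues column name to the set of literals the filter
   *                      permits; {@code x = 'a'} is the singleton {a}
   * @param groupKey      columns whose combined NDV is being estimated
   * @param inputNdv      estimate obtained from the input operator
   */
  static double upperBound(Map<String, Set<String>> allowedValues,
      List<String> groupKey, double inputNdv) {
    double bound = 1d;
    for (String column : groupKey) {
      Set<String> values = allowedValues.get(column);
      if (values == null) {
        // The filter does not constrain this column, so it yields no
        // bound for the key as a whole; keep the input estimate.
        return inputNdv;
      }
      // Each constrained column contributes at most |values| choices.
      bound *= values.size();
    }
    // The filter can only tighten the estimate, never loosen it.
    return Math.min(bound, inputNdv);
  }

  public static void main(String[] args) {
    // x in ('a', 'b') AND y in ('c', 'd', 'e')
    Map<String, Set<String>> filter = Map.of(
        "x", Set.of("a", "b"),
        "y", Set.of("c", "d", "e"));
    System.out.println(upperBound(filter, List.of("x", "y"), 1000d)); // prints 6.0
  }
}
```

Taking the minimum with the input estimate keeps the result from exceeding what the input operator already reports, for example when the IN lists are longer than the number of values actually present.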
Thoughts?