[ https://issues.apache.org/jira/browse/CALCITE-7173?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Alessandro Solimando updated CALCITE-7173:
------------------------------------------
    Fix Version/s: 1.41.0

> Improve RelMdDistinctRowCount estimation for lossless casts
> -----------------------------------------------------------
>
>                 Key: CALCITE-7173
>                 URL: https://issues.apache.org/jira/browse/CALCITE-7173
>             Project: Calcite
>          Issue Type: Improvement
>          Components: core
>    Affects Versions: 1.40.0
>            Reporter: Alessandro Solimando
>            Assignee: Alessandro Solimando
>            Priority: Minor
>              Labels: pull-request-available
>             Fix For: 1.41.0
>
>
> Consider the following test for _RelMetadataTest_:
> {code:java}
> @Test
> void testAggregateDistinctRowCountLosslessCast() {
>   final String values = "values ('b', 10), ('b', 20), ('b', 30)";
>   final String sql =
>       "select name, cast(sal as varchar(11)) from (" + values + ") t(name, sal) " +
>       "group by name, cast(sal as varchar(11))";
>   sql(sql).assertThatDistinctRowCount(bitSetOf(1), is(3d));
> }
> {code}
> The test currently fails as follows:
> {noformat}
> Expected: is <3.0>
>      but: was <1.6439107033725735>
> {noformat}
> For lossless casts (and in general for injective functions), one would expect
> "NDV(CAST($i)) = NDV($i)" to hold.
> A minimal fix would enhance
> [RelMdUtil.java#L596|https://github.com/apache/calcite/blob/calcite-1.40.0/core/src/main/java/org/apache/calcite/rel/metadata/RelMdUtil.java#L596]
> to consider lossless casts as references to input fields, since it's only
> used in
> [RelMdDistinctRowCount|https://github.com/apache/calcite/blob/calcite-1.40.0/core/src/main/java/org/apache/calcite/rel/metadata/RelMdDistinctRowCount.java#L258]
> and with the same exact spirit in
> [RelMdPopulationSize|https://github.com/apache/calcite/blob/calcite-1.40.0/core/src/main/java/org/apache/calcite/rel/metadata/RelMdPopulationSize.java#L138].

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
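
To illustrate the "NDV(CAST($i)) = NDV($i)" expectation from the description, here is a minimal standalone sketch in plain Java. It does not use any Calcite API: the class `LosslessCastNdv` and the helper `ndv` are hypothetical names introduced only for this example. It counts distinct values of the `sal` column from the test before and after an injective mapping (integer to string, standing in for the lossless cast to varchar) and after a non-injective one (integer division, standing in for a lossy transformation).

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Function;

// Hypothetical illustration class; not part of Calcite.
public class LosslessCastNdv {

  /** Returns the number of distinct values of f(x) over the given column. */
  static <T, R> int ndv(List<T> column, Function<T, R> f) {
    Set<R> distinct = new HashSet<>();
    for (T value : column) {
      distinct.add(f.apply(value));
    }
    return distinct.size();
  }

  public static void main(String[] args) {
    // The sal values from the test in the issue description.
    List<Integer> sal = Arrays.asList(10, 20, 30);

    // NDV of the raw column.
    int base = ndv(sal, v -> v);

    // Injective mapping (stand-in for a lossless cast): NDV is preserved.
    int lossless = ndv(sal, v -> Integer.toString(v));

    // Non-injective mapping (stand-in for a lossy function): NDV can shrink.
    int lossy = ndv(sal, v -> v / 100);

    System.out.println("base=" + base + " lossless=" + lossless + " lossy=" + lossy);
  }
}
```

The injective mapping leaves the distinct count unchanged, which is why treating a lossless cast as a plain reference to its input field is a reasonable basis for the estimate; the same shortcut would not be safe for an arbitrary function.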