Soumyakanti Das created CALCITE-4703:
Summary: getDistinctRowCount returns NULL for Converter and TableScan
Key: CALCITE-4703
URL: https://issues.apache.org/jira/browse/CALCITE-4703
Project: Calcite
Issue Type: Improvement
Components: core
Affects Versions: 1.27.0
Reporter: Soumyakanti Das
The catch-all method
[getDistinctRowCount|https://github.com/apache/calcite/blob/master/core/src/main/java/org/apache/calcite/rel/metadata/RelMdDistinctRowCount.java#L76]
returns null when the RelNode is a Converter or a TableScan (for example,
JdbcTableScan). It would be nice to have overloaded methods that handle these
types, so that callers get an estimate instead of null.
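To illustrate the request, here is a minimal, self-contained sketch of the idea. It deliberately does not use Calcite's classes: Node, Scan, Converter, Other and dispatch are hypothetical stand-ins, and the statistic stored on Scan is an assumed input. It only shows how type-specific overloads can sit beside a catch-all that returns null, and it assumes a converter does not change the rows of its input.

```java
/**
 * Stand-in model (NOT Calcite's API): type-specific overloads next to a
 * catch-all that returns null. All names here are hypothetical.
 */
public class DistinctRowCountSketch {

    /** Hypothetical stand-in for a relational node. */
    public interface Node {}

    /** Hypothetical scan that carries an assumed distinct-row statistic. */
    public static final class Scan implements Node {
        final double distinctRows;
        public Scan(double distinctRows) { this.distinctRows = distinctRows; }
    }

    /** Hypothetical converter wrapping a single input. */
    public static final class Converter implements Node {
        final Node input;
        public Converter(Node input) { this.input = input; }
    }

    /** Hypothetical node type that has no dedicated overload. */
    public static final class Other implements Node {}

    /** Overload for scans: answer from the statistic the scan carries. */
    static Double getDistinctRowCount(Scan scan) {
        return scan.distinctRows;
    }

    /** Overload for converters: assumed to leave the rows unchanged, so delegate to the input. */
    static Double getDistinctRowCount(Converter converter) {
        return dispatch(converter.input);
    }

    /** Catch-all: no information available, so the result is null. */
    static Double getDistinctRowCount(Node node) {
        return null;
    }

    /** Explicit dispatch on the runtime type; Java picks overloads at compile time. */
    public static Double dispatch(Node node) {
        if (node instanceof Scan) {
            return getDistinctRowCount((Scan) node);
        }
        if (node instanceof Converter) {
            return getDistinctRowCount((Converter) node);
        }
        return getDistinctRowCount(node);
    }

    public static void main(String[] args) {
        Node scan = new Scan(42.0);
        System.out.println(dispatch(new Converter(scan)));
        System.out.println(dispatch(new Other()));
    }
}
```

With only the catch-all present, every call in this model would return null, and a caller that unboxes the result without a null check would fail.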
Currently in Hive, we get an NPE while computing selectivity for the
following query:
{code:sql}
explain cbo
with t1 as (select fkey, ikey, sum(dkey) as dk_sum, sum(dkey2) as dk2_sum
from ext_simple_derby_table1 left join ext_simple_derby_table3
on ikey = ikey2
where fkey2 is null
group by fkey, ikey),
t2 as (select datekey, fkey, ikey, sum(dkey) as dk_sum2, sum(dkey2) as dk2_sum2
from ext_simple_derby_table2 left join ext_simple_derby_table4
on ikey = ikey2
where fkey2 is null
group by datekey, fkey, ikey)
select t1.fkey, t2.ikey, sum(t1.ikey)
from t1 left join t2
on t1.ikey = t2.ikey AND t1.fkey = t2.fkey
where t2.fkey is null
group by t2.datekey, t1.fkey, t2.ikey
{code}
Stacktrace:
{code:java}
java.lang.NullPointerException
    at org.apache.hadoop.hive.ql.optimizer.calcite.stats.HiveRelMdSelectivity.getSelectivity(HiveRelMdSelectivity.java:78)
    at GeneratedMetadataHandler_Selectivity.getSelectivity_$(Unknown Source)
    at GeneratedMetadataHandler_Selectivity.getSelectivity(Unknown Source)
    at GeneratedMetadataHandler_Selectivity.getSelectivity_$(Unknown Source)
    at GeneratedMetadataHandler_Selectivity.getSelectivity(Unknown Source)
    at org.apache.calcite.rel.metadata.RelMetadataQuery.getSelectivity(RelMetadataQuery.java:426)
    at org.apache.calcite.rel.metadata.RelMdUtil.estimateFilteredRows(RelMdUtil.java:765)
    at org.apache.calcite.rel.metadata.RelMdRowCount.getRowCount(RelMdRowCount.java:131)
    at org.apache.hadoop.hive.ql.optimizer.calcite.stats.HiveRelMdRowCount.getRowCount(HiveRelMdRowCount.java:175)
    at org.apache.hadoop.hive.ql.optimizer.calcite.stats.HiveRelMdRuntimeRowCount.getRowCount(HiveRelMdRuntimeRowCount.java:53)
    at GeneratedMetadataHandler_RowCount.getRowCount_$(Unknown Source)
    at GeneratedMetadataHandler_RowCount.getRowCount(Unknown Source)
    at GeneratedMetadataHandler_RowCount.getRowCount_$(Unknown Source)
    at GeneratedMetadataHandler_RowCount.getRowCount(Unknown Source)
    at org.apache.calcite.rel.metadata.RelMetadataQuery.getRowCount(RelMetadataQuery.java:212)
    at org.apache.calcite.rel.metadata.RelMdRowCount.getRowCount(RelMdRowCount.java:205)
    at GeneratedMetadataHandler_RowCount.getRowCount_$(Unknown Source)
    at GeneratedMetadataHandler_RowCount.getRowCount(Unknown Source)
    at GeneratedMetadataHandler_RowCount.getRowCount_$(Unknown Source)
    at GeneratedMetadataHandler_RowCount.getRowCount(Unknown Source)
    at org.apache.calcite.rel.metadata.RelMetadataQuery.getRowCount(RelMetadataQuery.java:212)
    at org.apache.calcite.rel.metadata.RelMdRowCount.getRowCount(RelMdRowCount.java:140)
    at GeneratedMetadataHandler_RowCount.getRowCount_$(Unknown Source)
    at GeneratedMetadataHandler_RowCount.getRowCount(Unknown Source)
    at GeneratedMetadataHandler_RowCount.getRowCount_$(Unknown Source)
    at GeneratedMetadataHandler_RowCount.getRowCount(Unknown Source)
    at org.apache.calcite.rel.metadata.RelMetadataQuery.getRowCount(RelMetadataQuery.java:212)
    at org.apache.calcite.rel.metadata.RelMdUtil.getJoinRowCount(RelMdUtil.java:723)
    at org.apache.calcite.rel.core.Join.estimateRowCount(Join.java:205)
    at org.apache.hadoop.hive.ql.optimizer.calcite.stats.HiveRelMdRowCount.getRowCount(HiveRelMdRowCount.java:113)
    at GeneratedMetadataHandler_RowCount.getRowCount_$(Unknown Source)
    at GeneratedMetadataHandler_RowCount.getRowCount(Unknown Source)
    at org.apache.calcite.rel.metadata.RelMetadataQuery.getRowCount(RelMetadataQuery.java:212)
    at org.apache.hadoop.hive.ql.optimizer.calcite.stats.FilterSelectivityEstimator.<init>(FilterSelectivityEstimator.java:62)
    at org.apache.hadoop.hive.ql.optimizer.calcite.rules.HiveFilterSortPredicates$RexSortPredicatesShuttle.<init>(HiveFilterSortPredicates.java:126)
    at org.apache.hadoop.hive.ql.optimizer.calcite.rules.HiveFilterSortPredicates$RexSortPredicatesShuttle.<init>(HiveFilterSortPredicates.java:120)
at