[
https://issues.apache.org/jira/browse/CALCITE-3963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17119284#comment-17119284
]
Xiening Dai commented on CALCITE-3963:
--------------------------------------
I created a PR, https://github.com/apache/calcite/pull/1992/files; please take
a look. It only addresses row count so far, but I could make similar changes for
other logical properties. I would like to hear some feedback before moving
forward.
The basic idea is to always use set.rel (renamed to set.originalRel) for
reporting the logical properties of a given set. We add a new interface,
LogicalNode, which reports the confidence of its stats estimation. Whenever a
new LogicalNode is added to the set, we check whether its confidence level is
higher than that of set.rel, and if so, it replaces set.rel. There are use
cases where the confidence level is useful. For example, we give TableScan a
high confidence level, which ensures that for a materialized view the table
scan node is used for reporting row count instead of other equivalent rel
nodes. That makes sense: the view has been materialized, so we can safely
assume that the scan has the most accurate row count. We can also extend the
interface to handle more complex scenarios in the future.
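
To make the idea above concrete, here is a minimal, self-contained sketch of
the confidence-based selection. It is not the code from the PR. LogicalNode is
the interface name used above, but the method names (estimateRowCount,
statsConfidence), the integer confidence scale, and the SimpleNode and
EquivalenceSet classes are assumptions made up for illustration; they are not
Calcite's RelNode or RelSet.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for the proposed LogicalNode interface. */
interface LogicalNode {
  double estimateRowCount();

  /** Assumed method name: a higher value means more trustworthy stats. */
  int statsConfidence();
}

/** Toy node used only for this sketch; not Calcite's RelNode. */
final class SimpleNode implements LogicalNode {
  private final String name;
  private final double rowCount;
  private final int confidence;

  SimpleNode(String name, double rowCount, int confidence) {
    this.name = name;
    this.rowCount = rowCount;
    this.confidence = confidence;
  }

  @Override public double estimateRowCount() { return rowCount; }
  @Override public int statsConfidence() { return confidence; }
  @Override public String toString() { return name; }
}

/** Toy equivalence set; only loosely mirrors the idea of set.originalRel. */
final class EquivalenceSet {
  private final List<LogicalNode> nodes = new ArrayList<>();
  private LogicalNode originalRel;

  /** Adds a node and promotes it if its confidence is strictly higher. */
  void add(LogicalNode node) {
    nodes.add(node);
    if (originalRel == null
        || node.statsConfidence() > originalRel.statsConfidence()) {
      originalRel = node;
    }
  }

  /** The set-level row count always comes from the representative node. */
  double rowCount() {
    if (originalRel == null) {
      throw new IllegalStateException("empty set");
    }
    return originalRel.estimateRowCount();
  }

  LogicalNode originalRel() { return originalRel; }
}

public class ConfidenceSketch {
  public static void main(String[] args) {
    EquivalenceSet set = new EquivalenceSet();
    set.add(new SimpleNode("Join", 1500.0, 1));
    set.add(new SimpleNode("TableScan(mv)", 1000.0, 10)); // high confidence
    set.add(new SimpleNode("Aggregate", 800.0, 1));
    System.out.println(set.rowCount()); // prints 1000.0
  }
}
```

With a strict comparison, a later node of equal confidence does not displace
the current representative, so the reported row count stays stable as
equivalent nodes are registered. Whether ties should behave this way is a
design choice this sketch assumes.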
> Maintain logical properties at RelSet (equivalent group) instead of RelNode
> ---------------------------------------------------------------------------
>
> Key: CALCITE-3963
> URL: https://issues.apache.org/jira/browse/CALCITE-3963
> Project: Calcite
> Issue Type: Bug
> Reporter: Xiening Dai
> Assignee: Xiening Dai
> Priority: Major
>
> Currently the logical properties (such as row count, distinct row count, etc.)
> are maintained at the RelNode level. This creates a number of metadata
> consistency problems, e.g. CALCITE-1048, CALCITE-2166.
> In theory, all RelNodes in a RelSet should share the same logical properties
> by the definition of relational equivalence. So it makes more sense to keep
> logical properties at the RelSet level rather than the RelNode level. Such
> properties shouldn't change when a new subset is created or a subset's best
> is changed.
> Specifically, I think the built-in metadata below should fall into the
> logical properties category:
> Selectivity
> UniqueKeys
> ColumnUniqueness
> RowCount
> MaxRowCount
> MinRowCount
> DistinctRowCount
> Size (averageRowSize, averageColumnSize)