[ 
https://issues.apache.org/jira/browse/CALCITE-3963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17119284#comment-17119284
 ] 

Xiening Dai commented on CALCITE-3963:
--------------------------------------

I created a PR - https://github.com/apache/calcite/pull/1992/files - please take 
a look. It only addresses row count so far, but I could make similar changes 
for the other logical properties. I just want to hear some feedback before 
moving forward.

The basic idea is to always use set.rel (renamed to set.originalRel) for 
reporting the logical properties of a given set. We add a new interface, 
LogicalNode, which can report the confidence of its stats estimation. Whenever 
a new LogicalNode is added to the set, we check whether its confidence level is 
higher than that of set.rel, and if so, we make it the new set.rel. There are 
some use cases where the confidence level is useful. For example, we treat 
TableScan as having a high confidence level, which ensures that for a 
materialized view the table scan node is used for reporting row count instead 
of other equivalent rel nodes. That makes sense: since the view has been 
materialized, we can safely assume the table scan has the most accurate row 
count. We can also extend the interface to handle more complex scenarios in 
the future.
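
To make the idea above concrete, here is a minimal, self-contained sketch. It 
is not the code from the PR: EquivalenceSet, SimpleNode, estimateRowCount() and 
confidence() are hypothetical stand-ins for RelSet, a rel node and the 
LogicalNode methods, and the integer confidence scale is an assumption made 
only for illustration.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for the proposed LogicalNode interface. */
interface LogicalNode {
    /** This node's own row count estimate. */
    double estimateRowCount();

    /** Confidence in the estimate; higher is better (assumed scale). */
    int confidence();
}

/** Trivial node with a fixed estimate, for illustration only. */
final class SimpleNode implements LogicalNode {
    private final double rowCount;
    private final int confidence;

    SimpleNode(double rowCount, int confidence) {
        this.rowCount = rowCount;
        this.confidence = confidence;
    }

    @Override public double estimateRowCount() { return rowCount; }
    @Override public int confidence() { return confidence; }
}

/** Hypothetical stand-in for RelSet: one row count for the whole set. */
final class EquivalenceSet {
    private final List<LogicalNode> members = new ArrayList<>();
    private LogicalNode originalRel; // node that reports logical properties

    /** Adds a node; it becomes originalRel only if strictly more confident. */
    void add(LogicalNode node) {
        members.add(node);
        if (originalRel == null
                || node.confidence() > originalRel.confidence()) {
            originalRel = node;
        }
    }

    /** Row count of the set, always answered by originalRel. */
    double rowCount() {
        if (originalRel == null) {
            throw new IllegalStateException("set has no members");
        }
        return originalRel.estimateRowCount();
    }
}

class ConfidenceDemo {
    public static void main(String[] args) {
        EquivalenceSet set = new EquivalenceSet();
        set.add(new SimpleNode(50.0, 1));    // e.g. a derived estimate
        set.add(new SimpleNode(1000.0, 10)); // e.g. a scan over the view
        System.out.println(set.rowCount());
    }
}
```

In this sketch every member of the set reports the same row count, and adding 
a node with lower or equal confidence never changes the answer, which is the 
stability property the issue asks for.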

> Maintain logical properties at RelSet (equivalent group) instead of RelNode
> ---------------------------------------------------------------------------
>
>                 Key: CALCITE-3963
>                 URL: https://issues.apache.org/jira/browse/CALCITE-3963
>             Project: Calcite
>          Issue Type: Bug
>            Reporter: Xiening Dai
>            Assignee: Xiening Dai
>            Priority: Major
>
> Currently the logical properties (such as row count, distinct row count, etc.) 
> are maintained at the RelNode level. This creates a number of metadata 
> consistency problems, e.g. CALCITE-1048, CALCITE-2166. 
> In theory, all RelNodes in a RelSet should share the same logical properties, 
> by the definition of relational equivalence. So it makes more sense to keep 
> logical properties at the RelSet level rather than the RelNode level. Such 
> properties shouldn't change when a new subset is created or a subset's best 
> is changed.
> Specifically, I think the built-in metadata below should fall into the 
> logical properties category -
> Selectivity
> UniqueKeys
> ColumnUniqueness
> RowCount
> MaxRowCount
> MinRowCount
> DistinctRowCount
> Size (averageRowSize, averageColumnSize)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
