[ 
https://issues.apache.org/jira/browse/CALCITE-3963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17120385#comment-17120385
 ] 

Haisheng Yuan commented on CALCITE-3963:
----------------------------------------

Four values for the confidence level is indeed few; we can extend it to an 
integer, but I believe many people will just leave it at the default value. If 
we frequently change the rule execution order and the alternatives have the same 
`stats confidence level`, the result may be non-deterministic, yet I think the 
same issue exists in current Calcite too.
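
To make the tie case concrete, here is a minimal sketch. It is not Calcite code: 
`Estimate`, `pick`, and `resolve` are hypothetical names, and it assumes that on 
equal confidence the estimate seen first is kept. Under that assumption, the 
surviving row count depends only on the order in which the alternatives arrive.

```java
import java.util.Arrays;
import java.util.List;

public class ConfidenceTieSketch {
    // Hypothetical estimate: a row count plus an integer confidence level.
    static final class Estimate {
        final double rowCount;
        final int confidence;

        Estimate(double rowCount, int confidence) {
            this.rowCount = rowCount;
            this.confidence = confidence;
        }
    }

    // Assumed policy: replace the current estimate only on strictly higher confidence.
    static Estimate pick(Estimate current, Estimate candidate) {
        return candidate.confidence > current.confidence ? candidate : current;
    }

    // Fold the estimates in arrival order, i.e. in rule execution order.
    static Estimate resolve(List<Estimate> inArrivalOrder) {
        Estimate best = inArrivalOrder.get(0);
        for (Estimate e : inArrivalOrder.subList(1, inArrivalOrder.size())) {
            best = pick(best, e);
        }
        return best;
    }

    public static void main(String[] args) {
        Estimate a = new Estimate(100.0, 2);
        Estimate b = new Estimate(250.0, 2); // same confidence as a

        System.out.println(resolve(Arrays.asList(a, b)).rowCount); // 100.0
        System.out.println(resolve(Arrays.asList(b, a)).rowCount); // 250.0
    }
}
```

Widening the confidence level to an integer does not remove this effect by 
itself if most estimates keep the default value.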

Regarding the folding operations, I am not sure how we would fold the 
cardinality, histogram, most common values, most common frequencies, and 
constraint properties. After constant reduction, the selectivity for OR 
expressions might become smaller, yet the selectivity for AND expressions may 
become larger. Should we choose the min or the max operation for selectivity 
and cardinality?
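
A small sketch of the dilemma, using made-up selectivity numbers (the values 
and the `foldMin`/`foldMax` helpers are hypothetical, not part of Calcite):

```java
public class SelectivityFoldSketch {
    // Two candidate folds for merging a newly derived estimate into the stored one.
    static double foldMin(double stored, double derived) {
        return Math.min(stored, derived);
    }

    static double foldMax(double stored, double derived) {
        return Math.max(stored, derived);
    }

    public static void main(String[] args) {
        // Made-up selectivities before and after constant reduction.
        double orBefore = 0.40, orAfter = 0.25;    // OR estimate shrinks
        double andBefore = 0.05, andAfter = 0.10;  // AND estimate grows

        System.out.println("OR  min=" + foldMin(orBefore, orAfter)
            + " max=" + foldMax(orBefore, orAfter));
        System.out.println("AND min=" + foldMin(andBefore, andAfter)
            + " max=" + foldMax(andBefore, andAfter));
    }
}
```

With these numbers, min keeps the post-reduction value for OR (0.25) but the 
pre-reduction value for AND (0.05), and max does the opposite, so neither fold 
follows the reduced expression in both cases.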

With folding operations, do we still need to compute the logical properties 
for all the MergeJoins, HashJoins, and NestedLoop joins with different 
distribution policies that are all generated from a single LogicalJoin?

 

> Maintain logical properties at RelSet (equivalent group) instead of RelNode
> ---------------------------------------------------------------------------
>
>                 Key: CALCITE-3963
>                 URL: https://issues.apache.org/jira/browse/CALCITE-3963
>             Project: Calcite
>          Issue Type: Bug
>            Reporter: Xiening Dai
>            Assignee: Xiening Dai
>            Priority: Major
>
> Currently the logical properties (such as row count, distinct row count, etc.) 
> are maintained at the RelNode level. This creates a number of metadata 
> consistency problems, e.g. CALCITE-1048, CALCITE-2166. 
> In theory, all RelNodes in a RelSet should share the same logical properties 
> per the definition of relational equivalence. So it makes more sense to keep 
> logical properties at the RelSet level, rather than the RelNode level. Such 
> properties shouldn't change when a new subset is created or a subset's best 
> is changed.
> Specifically, I think the following built-in metadata should fall into the 
> logical properties category:
> Selectivity
> UniqueKeys
> ColumnUniqueness
> RowCount
> MaxRowCount
> MinRowCount
> DistinctRowCount
> Size (averageRowSize, averageColumnSize)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
