[ 
https://issues.apache.org/jira/browse/CALCITE-3963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17155793#comment-17155793
 ] 

Xiening Dai commented on CALCITE-3963:
--------------------------------------

{quote}
If one RelNode thinks a query returns at least 1 row, and another equivalent 
RelNode thinks that a query returns at least 2 rows, then the query must return 
at least 2 rows.
{quote}

The two RelNodes are equivalent, and each one calculates its minRowCount 
independently. minRowCount is defined for a RelNode, not for a RelSet or 
RelSubset. How would it make sense to aggregate them?
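
To make the quoted claim concrete: aggregating across an equivalence set would 
mean taking the largest of the independently computed lower bounds. A minimal 
sketch of that arithmetic only; the class and method names are hypothetical 
and are not part of Calcite or of this proposal:

```java
import java.util.Arrays;
import java.util.List;

/** Illustrative only: not a Calcite class. */
final class MinRowCountSketch {
  private MinRowCountSketch() {}

  /**
   * Returns the tightest lower bound implied by several independently
   * computed lower bounds on the same row count (that is, their maximum).
   */
  static double tightestLowerBound(List<Double> perNodeMinRowCounts) {
    double bound = 0d; // a relation never has fewer than zero rows
    for (double candidate : perNodeMinRowCounts) {
      bound = Math.max(bound, candidate);
    }
    return bound;
  }
}
```

With the bounds 1 and 2 from the quote, this returns 2. Whether such a value 
belongs to the RelNode or to the set is the open question above.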

{quote}
It is foolish to regard any RelNode as inherently high or low confidence based 
on its type.
{quote}

It doesn't have to be based solely on type. With the current proposal, 
getEstimateConfidence() could be implemented to account for the confidence 
level of the inputs as well. I am just not doing that right now because I 
don't see a real need for it. MultiJoin is treated as low confidence because 
it doesn't currently implement proper stats estimation.
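
As a rough illustration of how input confidence could be folded in, here is a 
minimal sketch. It is not the code in the proposal: the Confidence enum, its 
levels, and the combine() helper are hypothetical names, and "take the weakest 
level" is just one possible rule.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical confidence levels, ordered from weakest to strongest. */
enum Confidence { LOW, MEDIUM, HIGH }

/** Illustrative only: not a Calcite class. */
final class ConfidenceSketch {
  private ConfidenceSketch() {}

  /**
   * Combines a node's own confidence with the confidence of its inputs.
   * Assumed rule: the result is the weakest level among all of them.
   */
  static Confidence combine(Confidence own, List<Confidence> inputs) {
    Confidence result = own;
    for (Confidence input : inputs) {
      if (input.compareTo(result) < 0) {
        result = input; // an input is weaker than what we have so far
      }
    }
    return result;
  }
}
```

Under this rule, a node that is HIGH on its own but has one LOW input would 
report LOW, so the estimate no longer depends on the node's type alone.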

The concept of confidence is implemented in a simple way in this proposal, but 
it can surely be extended when necessary. I am not saying the current proposal 
is perfect, and I don't think we should expect perfect code. My philosophy is 
that we should favor commits as long as they improve the current state of the 
code base.

> Maintain logical properties at RelSet (equivalent group) instead of RelNode
> ---------------------------------------------------------------------------
>
>                 Key: CALCITE-3963
>                 URL: https://issues.apache.org/jira/browse/CALCITE-3963
>             Project: Calcite
>          Issue Type: Bug
>            Reporter: Xiening Dai
>            Assignee: Xiening Dai
>            Priority: Major
>          Time Spent: 1h
>  Remaining Estimate: 0h
>
> Currently the logical properties (such as row count, distinct row count, etc.) 
> are maintained at the RelNode level. This creates a number of metadata 
> consistency problems, e.g. CALCITE-1048, CALCITE-2166. 
> In theory, all RelNodes in a RelSet should share the same logical properties 
> by the definition of relational equivalence. So it makes more sense to keep 
> logical properties at the RelSet level rather than the RelNode level. Such 
> properties shouldn't change when a new subset is created or a subset's best 
> is changed.
> Specifically, I think the built-in metadata below should fall into the 
> logical properties category:
> Selectivity
> UniqueKeys
> ColumnUniqueness
> RowCount
> MaxRowCount
> MinRowCount
> DistinctRowCount
> Size (averageRowSize, averageColumnSize)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
