[ 
https://issues.apache.org/jira/browse/CALCITE-4223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17192601#comment-17192601
 ] 

Chunwei Lei commented on CALCITE-4223:
--------------------------------------

There is always an alternative because the interface is flexible. But I think it 
would be great if Calcite had column statistics, since they help generate a 
better plan.

Even if column statistics are provided, we would still obtain statistics such as 
NDV through {{RelMetadataQuery}}. What we need to do is add or modify some methods 
in the RelMdxxx classes. Taking NDV as an example:

 
{code:java}
// RelMdDistinctRowCount.java

public Double getDistinctRowCount(TableScan rel,
    RelMetadataQuery mq, ImmutableBitSet groupKey,
    RexNode predicate) {
    // ......
    // getColumnStatistics() is the proposed new method; it does not exist yet.
    List<Double> allDistinctValue =
        getAllDistinctValue(rel.getTable().getColumnStatistics(), groupKey);
    return getJointDistinctValue(allDistinctValue);
}
{code}
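
To make the last step of the snippet above more concrete, here is a minimal sketch of what a helper like getJointDistinctValue could do. The class name, the extra rowCount parameter, and the independence assumption are all hypothetical and not part of Calcite: it multiplies the per-column NDVs and caps the result at the row count, returning null when any input is unknown.

```java
import java.util.List;

/** Hypothetical helper for illustration only; not a Calcite class. */
final class JointNdvSketch {
  private JointNdvSketch() {}

  /**
   * Combines per-column NDVs into one estimate for the column group.
   * Assumption: columns are independent, so the joint NDV is the product
   * of the per-column NDVs, capped at the table row count.
   * Returns null if any per-column NDV is unknown.
   */
  public static Double jointDistinctValue(List<Double> ndvs, double rowCount) {
    double product = 1d;
    for (Double ndv : ndvs) {
      if (ndv == null) {
        return null; // no reliable estimate for this column
      }
      product *= ndv;
    }
    // A group can never have more distinct values than the table has rows.
    return Math.min(product, rowCount);
  }
}
```

Independence tends to overestimate the joint NDV for correlated columns, which is why the cap at the row count matters; a real handler might choose a different combination rule.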
 

Regarding the concern that statistics have no fixed definition, we can provide some 
basic, frequently used column statistics, including NDV/AverageColumnSize/nullCount. 
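
As a rough illustration of such a basic set, the sketch below bundles the three statistics into one immutable holder. The class and method names are hypothetical, chosen only for this example, and every value is nullable so that "unknown" can be expressed.

```java
/** Hypothetical value class for illustration; not a Calcite API. */
final class BasicColumnStatistics {
  private final Double ndv;               // number of distinct values, null if unknown
  private final Double averageColumnSize; // average value size in bytes, null if unknown
  private final Double nullCount;         // number of null values, null if unknown

  BasicColumnStatistics(Double ndv, Double averageColumnSize, Double nullCount) {
    this.ndv = ndv;
    this.averageColumnSize = averageColumnSize;
    this.nullCount = nullCount;
  }

  public Double getNdv() {
    return ndv;
  }

  public Double getAverageColumnSize() {
    return averageColumnSize;
  }

  public Double getNullCount() {
    return nullCount;
  }

  /** Fraction of rows that are null; null if unknown or rowCount is not positive. */
  public Double nullFraction(double rowCount) {
    if (nullCount == null || rowCount <= 0) {
      return null;
    }
    return Math.min(1d, nullCount / rowCount);
  }
}
```

Keeping the holder small and nullable would let adapters supply only the statistics they actually have, while metadata handlers fall back to their current estimates for anything missing.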

 

Let me summarize to help others follow the discussion. There are two questions:

1) Do you agree to introduce column statistics?

2) If so, where should we put them? (RelOptTable? Statistics? Or somewhere else?)

> Introducing column statistics to RelOptTable
> --------------------------------------------
>
>                 Key: CALCITE-4223
>                 URL: https://issues.apache.org/jira/browse/CALCITE-4223
>             Project: Calcite
>          Issue Type: Improvement
>            Reporter: Chunwei Lei
>            Assignee: Chunwei Lei
>            Priority: Major
>
> Many systems depend on column statistics to compute more accurate stats, such 
> as NDV, average column size, and so on. It would be nice if Calcite can 
> provide such an interface.
> Column statistics might include NDV, average/max column length, number of 
> nulls, number of trues, number of falses and so on. 
> What do you think?
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
