[ 
https://issues.apache.org/jira/browse/CALCITE-4223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17192359#comment-17192359
 ] 

Julian Hyde commented on CALCITE-4223:
--------------------------------------

I don't think we should extend {{interface RelOptTable}} or {{interface 
Statistics}}. There are a few reasons.

First, {{RelOptTable}} already extends {{interface Wrapper}}. If you want your 
implementation of {{RelOptTable}} to provide some other interface, you can get 
it via {{Wrapper.unwrap()}}.
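
To make the first point concrete, here is a minimal, self-contained sketch of the unwrap pattern. The {{Wrapper}} interface below is a local stand-in that only mirrors the shape of Calcite's interface, and {{ColumnStatistics}}, {{StatsTable}} and {{UnwrapDemo}} are hypothetical names invented for this illustration; none of them is Calcite API.

```java
import java.util.HashMap;
import java.util.Map;

// Local stand-in mirroring the shape of Calcite's Wrapper interface.
// This is NOT the real Calcite type; it exists so the sketch compiles alone.
interface Wrapper {
  <C> C unwrap(Class<C> aClass);
}

// Hypothetical engine-specific statistics interface (an assumption, not Calcite API).
interface ColumnStatistics {
  Double ndv(int column); // null when the NDV of the column is unknown
}

// Hypothetical table that offers its column statistics only through unwrap().
final class StatsTable implements Wrapper, ColumnStatistics {
  private final Map<Integer, Double> ndvByColumn;

  StatsTable(Map<Integer, Double> ndvByColumn) {
    this.ndvByColumn = new HashMap<>(ndvByColumn);
  }

  @Override public Double ndv(int column) {
    return ndvByColumn.get(column);
  }

  @Override public <C> C unwrap(Class<C> aClass) {
    // Hand out this object if it implements the requested interface, else null.
    return aClass.isInstance(this) ? aClass.cast(this) : null;
  }
}

final class UnwrapDemo {
  // The caller asks for the optional interface and copes with its absence.
  static Double ndvOrNull(Wrapper table, int column) {
    ColumnStatistics stats = table.unwrap(ColumnStatistics.class);
    return stats == null ? null : stats.ndv(column);
  }
}
```

The table's core interface stays unchanged; a caller that knows about {{ColumnStatistics}} asks for it, and a table that does not provide it simply returns null.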

Second, there is no fixed definition of statistics. Different engines are going 
to have different variations. One engine might have a histogram that tells you 
the number of distinct values of productId when state = 'CA'; another might 
not. That's why we created the {{RelMetadataQuery}} framework.

Third and fourth, we will need caching, and a way to plug in multiple 
providers. Again, that's what the {{RelMetadataQuery}} framework is for.
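
As a toy illustration of those two requirements (and not of how {{RelMetadataQuery}} is actually implemented), the sketch below asks a chain of providers in order and caches the first answer. {{NdvProvider}} and {{CachingNdvQuery}} are hypothetical names made up for this example.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Hypothetical provider interface: a given engine may or may not know an NDV.
interface NdvProvider {
  Optional<Double> ndv(String table, int column);
}

// Toy query object: consults providers in order and caches the outcome.
final class CachingNdvQuery {
  private final List<NdvProvider> providers;
  private final Map<String, Optional<Double>> cache = new HashMap<>();
  int providerCalls = 0; // exposed only so the effect of caching is observable

  CachingNdvQuery(List<NdvProvider> providers) {
    this.providers = providers;
  }

  Optional<Double> ndv(String table, int column) {
    String key = table + "#" + column;
    Optional<Double> cached = cache.get(key);
    if (cached != null) {
      return cached; // answered from the cache, no provider is consulted
    }
    Optional<Double> result = Optional.empty();
    for (NdvProvider provider : providers) {
      providerCalls++;
      result = provider.ndv(table, column);
      if (result.isPresent()) {
        break; // the first provider with an answer wins
      }
    }
    cache.put(key, result);
    return result;
  }
}
```

Putting this logic in the query object rather than in the table interface means a new engine only has to add a provider; the table interface does not change.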

Why not write a convenience method instead?

{code}
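  // Sketch only: TableScan is abstract, so "new TableScan(table)" stands in
  // for whatever scan node you have (or create) for the table.
  double getNdv(RelMetadataQuery mq, RelOptTable table, int column) {
    return mq.getPopulationSize(new TableScan(table),
        ImmutableBitSet.of(column));
  }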
{code}


> Introducing column statistics to RelOptTable
> --------------------------------------------
>
>                 Key: CALCITE-4223
>                 URL: https://issues.apache.org/jira/browse/CALCITE-4223
>             Project: Calcite
>          Issue Type: Improvement
>            Reporter: Chunwei Lei
>            Assignee: Chunwei Lei
>            Priority: Major
>
> Many systems depend on column statistics to compute more accurate stats, such 
> as NDV, average column size, and so on. It would be nice if Calcite can 
> provide such an interface.
> Column statistics might include NDV, average/max column length, number of 
> nulls, number of trues, number of falses and so on. 
> What do you think?
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
