[ https://issues.apache.org/jira/browse/CALCITE-4223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17198105#comment-17198105 ]
Chunwei Lei commented on CALCITE-4223:
--------------------------------------
Thank you for your review, Julian.
{quote}it does not easily allow people to add new kinds of metadata, and it
does not accommodate differences in data structures that may have more
information (e.g. a system that has a histogram that returns not just number of
distinct values, but the number of distinct values between 100 and 1000)
{quote}
Users can implement ColStatistics and add new methods, then use unwrap() to get
the customized ColStatistics. Compared with
{{table.unwrap(BuiltinMetadata.Size.class)}}, I think
{{sum(table.getColumnStatistics(col).getAvgColLen())}} is much more
straightforward and readable.
Also, does your proposal mean that {{RelOptTable}} has to implement interfaces
like {{BuiltinMetadata.Size}}/{{BuiltinMetadata.DistinctRowCount}}?
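To make the first point concrete, here is a minimal, self-contained sketch of what I have in mind. It does not use Calcite classes: the shape of {{ColStatistics}}, the {{HistogramColStatistics}} extension, {{SimpleHistogramStats}}, {{ColStatsDemo}} and every method name below are assumptions for illustration only, not the API in the pull request, and the uniform-spread estimate is just a placeholder for whatever a real histogram would return.

```java
import java.util.List;

/** Hypothetical per-column statistics interface (an assumption, not Calcite's API). */
interface ColStatistics {
  double getNdv();        // number of distinct values
  double getAvgColLen();  // average column length in bytes

  /** Returns this object as the requested type, or null if it is not supported. */
  default <T> T unwrap(Class<T> clazz) {
    return clazz.isInstance(this) ? clazz.cast(this) : null;
  }
}

/** Hypothetical extension for a system that keeps a histogram per column. */
interface HistogramColStatistics extends ColStatistics {
  /** Estimated number of distinct values in the range [low, high]. */
  double getNdvBetween(long low, long high);
}

/** Toy implementation with fixed numbers, only for illustration. */
final class SimpleHistogramStats implements HistogramColStatistics {
  private final double ndv;
  private final double avgColLen;
  private final long min;
  private final long max;

  SimpleHistogramStats(double ndv, double avgColLen, long min, long max) {
    this.ndv = ndv;
    this.avgColLen = avgColLen;
    this.min = min;
    this.max = max;
  }

  @Override public double getNdv() {
    return ndv;
  }

  @Override public double getAvgColLen() {
    return avgColLen;
  }

  @Override public double getNdvBetween(long low, long high) {
    // Placeholder estimate: assume distinct values are spread uniformly over [min, max].
    long lo = Math.max(low, min);
    long hi = Math.min(high, max);
    if (hi <= lo || max <= min) {
      return 0;
    }
    return ndv * (hi - lo) / (double) (max - min);
  }
}

final class ColStatsDemo {
  private ColStatsDemo() {}

  /** Average row size as the sum of the average column lengths. */
  static double averageRowSize(List<ColStatistics> columns) {
    double sum = 0;
    for (ColStatistics column : columns) {
      sum += column.getAvgColLen();
    }
    return sum;
  }
}
```

A caller that only needs the common statistics uses {{ColStatistics}} directly. A caller that knows about the histogram calls {{stats.unwrap(HistogramColStatistics.class)}} and falls back to the generic NDV when the result is null.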
> Introducing column statistics to RelOptTable
> --------------------------------------------
>
> Key: CALCITE-4223
> URL: https://issues.apache.org/jira/browse/CALCITE-4223
> Project: Calcite
> Issue Type: Improvement
> Reporter: Chunwei Lei
> Assignee: Chunwei Lei
> Priority: Major
> Labels: pull-request-available
> Time Spent: 10m
> Remaining Estimate: 0h
>
> Many systems depend on column statistics to compute more accurate stats, such
> as NDV, average column size, and so on. It would be nice if Calcite can
> provide such an interface.
> Column statistics might include NDV, average/max column length, number of
> nulls, number of trues, number of falses and so on.
> What do you think?
>
--
This message was sent by Atlassian Jira
(v8.3.4#803005)