[ 
https://issues.apache.org/jira/browse/CALCITE-4223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17190493#comment-17190493
 ] 

Chunwei Lei commented on CALCITE-4223:
--------------------------------------

[~amaliujia] , column stats are used in Hive[1], Flink[2], Spark[3].

[1] 
[https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/plan/ColStatistics.java]

[2][https://github.com/apache/flink/blob/master/flink-table/flink-table-common/src/main/java/org/apache/flink/table/plan/stats/ColumnStats.java]

[[3] 
https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/columnar/ColumnStats.scala|https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/columnar/ColumnStats.scala]

> Introducing column statistics to RelOptTable
> --------------------------------------------
>
>                 Key: CALCITE-4223
>                 URL: https://issues.apache.org/jira/browse/CALCITE-4223
>             Project: Calcite
>          Issue Type: Improvement
>            Reporter: Chunwei Lei
>            Assignee: Chunwei Lei
>            Priority: Major
>
> Many systems depend on column statistics to compute more accurate stats, such 
> as NDV, average column size, and so on. It would be nice if Calcite can 
> provide such an interface.
> Column statistics might include NDV, average/max column length, number of 
> nulls, number of trues, number of falses and so on. 
> What do you think?
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Reply via email to