[
https://issues.apache.org/jira/browse/FLINK-5566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15864062#comment-15864062
]
ASF GitHub Bot commented on FLINK-5566:
---------------------------------------
Github user fhueske commented on a diff in the pull request:
https://github.com/apache/flink/pull/3196#discussion_r100851670
--- Diff:
flink-libraries/flink-table/src/main/scala/org/apache/flink/table/plan/stats/ColumnStats.scala
---
@@ -0,0 +1,52 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.table.plan.stats
+
+/**
+ * column statistics
+ *
+ * @param ndv number of distinct values
+ * @param nullCount number of nulls
+ * @param avgLen average length of column values
+ * @param maxLen max length of column values
+ * @param max max value of column values
+ * @param min min value of column values
+ */
+case class ColumnStats(
+ ndv: Long,
+ nullCount: Long,
+ avgLen: Long,
--- End diff --
I think `Int` should be sufficient for value length.
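A minimal sketch of what this suggestion could look like, not the code that was merged. The field names come from the Scaladoc in the diff above. The diff is cut off after `avgLen`, so the types of `maxLen`, `max`, and `min` are assumptions made only for illustration. One likely reason `Int` is enough: lengths of JVM strings and arrays are themselves `Int` values, so a column value length cannot exceed `Int.MaxValue`.

```scala
// Sketch only: mirrors the parameters documented in the diff above.
// Types marked "assumed" are NOT taken from the pull request.
case class ColumnStats(
    ndv: Long,          // number of distinct values
    nullCount: Long,    // number of nulls
    avgLen: Int,        // average length of column values (Int instead of Long)
    maxLen: Int,        // max length of column values (assumed Int as well)
    max: Option[Any],   // max value of column values (assumed type)
    min: Option[Any])   // min value of column values (assumed type)

object ColumnStatsExample extends App {
  // Hypothetical statistics for a string column; the numbers are made up.
  val stats = ColumnStats(
    ndv = 1000L,
    nullCount = 5L,
    avgLen = 12,
    maxLen = 64,
    max = None,
    min = None)

  println(stats)
}
```

Row-level counters such as `ndv` and `nullCount` stay `Long` in this sketch, since a table can hold more than `Int.MaxValue` rows, while a single value's length cannot.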
> Introduce structure to hold table and column level statistics
> -------------------------------------------------------------
>
> Key: FLINK-5566
> URL: https://issues.apache.org/jira/browse/FLINK-5566
> Project: Flink
> Issue Type: Sub-task
> Components: Table API & SQL
> Reporter: Kurt Young
> Assignee: zhangjing
>
> We define two structures to hold statistics:
> 1. TableStats: holds table-level statistics; currently only one element, rowCount.
> 2. ColumnStats: holds column-level statistics:
> for numeric column types: ndv, nullCount, max, min, histogram
> for string type: ndv, nullCount, avgLen, maxLen
> for boolean: ndv, nullCount, trueCount, falseCount
> for date/time/timestamp: ndv, nullCount, max, min, histogram