Myasuka commented on code in PR #24975:
URL: https://github.com/apache/flink/pull/24975#discussion_r1663472146


##########
docs/content/docs/dev/table/materialized-table/statements.md:
##########
@@ -0,0 +1,344 @@
+---
+title: Statements
+weight: 2
+type: docs
+aliases:
+- /dev/table/materialized-table/statements.html
+---
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+  http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
+# Description
+
+Flink SQL supports the following Materialized Table statements for now:
+- [CREATE MATERIALIZED TABLE](#create-materialized-table)
+- [ALTER MATERIALIZED TABLE](#alter-materialized-table)
+- [DROP MATERIALIZED TABLE](#drop-materialized-table)
+
+# CREATE MATERIALIZED TABLE
+
+```
+CREATE MATERIALIZED TABLE [catalog_name.][db_name.]table_name
+
+[ ([ <table_constraint> ]) ]
+
+[COMMENT table_comment]
+
+[PARTITIONED BY (partition_column_name1, partition_column_name2, ...)]
+
+[WITH (key1=val1, key2=val2, ...)]
+
+FRESHNESS = INTERVAL '<num>' { SECOND | MINUTE | HOUR | DAY }
+
+[REFRESH_MODE = { CONTINUOUS | FULL }]
+
+AS <select_statement>
+
+<table_constraint>:
+  [CONSTRAINT constraint_name] PRIMARY KEY (column_name, ...) NOT ENFORCED
+```
+
+### PRIMARY KEY
+
+PRIMARY KEY defines an optional list of columns that uniquely identifies each 
row within the table. The column as the primary key must be non-null.
+
+### PARTITIONED BY
+
+PARTITIONED BY defines an optional list of columns to partition the 
materialized table. A directory is created for each partition if this 
materialized table is used as a filesystem sink.
+
+**Example:**
+
+```sql
+-- Create a materialized table and specify the partition field as `ds`.
+CREATE MATERIALIZED TABLE my_materialized_table
+    PARTITIONED BY (ds)
+    FRESHNESS = INTERVAL '1' HOUR
+    AS SELECT 
+        ds
+    FROM
+        ...
+```
+
+<span class="label label-danger">Note</span>
+- The partition column must be included in the query statement of the 
materialized table.
+
+### WITH Options
+
+WITH Options are used to specify the materialized table properties, including 
[connector options]({{< ref "docs/connectors/table/" >}}) and [time format 
option]({{< ref "docs/dev/table/config" >}}#partition-fields-date-formatter) 
for partition fields.
+
+```sql
+-- Create a materialized table, specify the partition field as 'ds', and the 
corresponding time format as 'yyyy-MM-dd'
+CREATE MATERIALIZED TABLE my_materialized_table
+    PARTITIONED BY (ds)
+    WITH (
+        'format' = 'json',
+        'partition.fields.ds.date-formatter' = 'yyyy-MM-dd'
+    )
+    ...
+```
+
+As shown in the above example, we specified the date-formatter option for the 
ds partition column. During each scheduling, the scheduling time will be 
converted to the ds partition value. For example, for a scheduling time of 
2024-01-01 00:00:00, only the partition ds = '2024-01-01' will be refreshed.
+
+<span class="label label-danger">Note</span>
+- The `partition.fields.#.date-formatter` option only works in full mode.
+- The field in the [partition.fields.#.date-formatter]({{< ref 
"docs/dev/table/config" >}}#partition-fields-date-formatter) must be a valid 
string type partition field.
+
+### FRESHNESS
+
+**FRESHNESS Definition and Refresh Mode Relationship**
+
+FRESHNESS defines the maximum amount of time that the materialized table’s 
content should lag behind updates to the base tables. It does two things: 
first, it determines the [refresh mode]({{< ref 
"docs/dev/table/materialized-table/overview" >}}#refresh-mode) of the 
materialized table through [configuration]({{< ref "docs/dev/table/config" 
>}}#materialized-table-refresh-mode-freshness-threshold); second, it 
determines the data refresh frequency needed to meet the actual data freshness 
requirements.
+
+**Detailed Explanation of FRESHNESS Parameter**
+
+The FRESHNESS parameter range is INTERVAL `'<num>'` { SECOND | MINUTE | HOUR | 
DAY }. `'<num>'` must be a positive integer, and in FULL mode, `'<num>'` should 
be a common divisor of the respective time interval.
+
+**Examples:**
+(Assuming `materialized-table.refresh-mode.freshness-threshold` is 30 minutes)
+
+```sql
+-- The corresponding refresh pipeline is a streaming job with a checkpoint 
interval of 1 second
+FRESHNESS = INTERVAL '1' SECOND

Review Comment:
   Since the checkpoint interval is currently bound to the `freshness` setting, 
I think we should warn users that a stricter (shorter) freshness puts more 
pressure on checkpointing. We could use a longer freshness in this example and 
tell users to consider the [changelog 
state-backend](https://nightlies.apache.org/flink/flink-docs-master/docs/ops/state/state_backends/#enabling-changelog).
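
A rough sketch of what such a note could show, not a tested setup. It assumes the changelog option keys documented for Flink releases around 1.15 to 1.19 (newer releases may use renamed keys), that session-level `SET` options are passed on to the refresh job, and a hypothetical source table `my_source` with a `user_id` column. The `s3://<bucket-name>` path is a placeholder.

```sql
-- Assumption: option keys as documented for the changelog state backend;
-- verify them against the Flink version in use.
SET 'state.backend.changelog.enabled' = 'true';
SET 'state.backend.changelog.storage' = 'filesystem';
SET 'dstl.dfs.base-path' = 's3://<bucket-name>';  -- placeholder path

-- A longer freshness than 1 second means a longer checkpoint interval
-- for the continuous refresh job.
CREATE MATERIALIZED TABLE my_materialized_table
    FRESHNESS = INTERVAL '1' MINUTE
    AS SELECT
        user_id,            -- hypothetical column
        COUNT(*) AS cnt
    FROM
        my_source           -- hypothetical source table
    GROUP BY
        user_id;
```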



##########
docs/content/docs/dev/table/materialized-table/overview.md:
##########
@@ -0,0 +1,65 @@
+---
+title: Overview
+weight: 1
+type: docs
+aliases:
+- /dev/table/materialized-table.html
+---
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+  http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
+# Introduction
+
+Materialized Table is a new table type introduced in Flink SQL, aimed at 
simplifying both batch and stream data pipelines and providing a consistent 
development experience. When you specify the data freshness and query while 
creating a materialized table, the engine automatically derives the schema for 
the materialized table and creates the corresponding data refresh pipeline to 
achieve the specified freshness.
+
+{{< hint warning >}}
+Note: This feature is currently an MVP (“minimum viable product”) feature and 
only available to [SQL Gateway]({{< ref "docs/dev/table/sql-gateway/overview" 
>}}) and [Standalone]({{< ref 
"docs/deployment/resource-providers/standalone/overview" >}}) cluster.

Review Comment:
   I think this statement might mislead users. We could improve it to:
   ```suggestion
   Note: This feature is currently an MVP (“minimum viable product”) feature 
and is only available within a [SQL Gateway]({{< ref 
"docs/dev/table/sql-gateway/overview" >}}) that is connected to a 
[Standalone]({{< ref "docs/deployment/resource-providers/standalone/overview" 
>}}) deployed Flink cluster.
   ```
   
   Same for the Chinese part.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
