xuyangzhong commented on code in PR #24198:
URL: https://github.com/apache/flink/pull/24198#discussion_r1467276024


##########
docs/content/docs/dev/table/sql/queries/hints.md:
##########
@@ -597,6 +597,131 @@ SELECT /*+ BROADCAST(t1) SHUFFLE_HASH(t1) */ * FROM t1 
FULL OUTER JOIN t2 ON t1.
 SELECT /*+ BROADCAST(t1) SHUFFLE_HASH(t1) */ * FROM t1 FULL OUTER JOIN t2 ON 
t1.id > t2.id;
 ```
 
+### State TTL Hints
+
+{{< label Streaming >}}
+
+For stateful computation [Regular Join]({{< ref 
"docs/dev/table/sql/queries/joins" >}}#regular-joins)
+and [Group Aggregation]({{< ref "docs/dev/table/sql/queries/group-agg" >}}), 
users can
+use `STATE_TTL` hint to
+specify operator-level [Idle State Retention Time]({{< ref 
"docs/dev/table/concepts/overview" >}}#idle-state-retention-time),
+which enables the aforementioned operators to have a different TTL against the 
pipeline level configuration [table.exec.state.ttl]({{< ref 
"docs/dev/table/config" >}}#table-exec-state-ttl).
+
+##### Regular Join Examples
+
+```sql
+CREATE TABLE orders (
+  o_orderkey INT,
+  o_custkey INT,
+  o_status BOOLEAN,
+  o_totalprice DOUBLE
+) WITH (...);
+
+CREATE TABLE lineitem (
+  l_linenumber int,
+  l_orderkey int,
+  l_partkey int,
+  l_extendedprice double
+) WITH (...);
+
+CREATE TABLE customers (
+  c_custkey int,
+  c_address string
+) WITH (...);
+
+-- table name as hint key
+SELECT /*+ STATE_TTL('orders'='3d', 'lineitem'='1d') */ * FROM
+orders LEFT JOIN lineitem
+ON orders.o_orderkey = lineitem.l_orderkey;
+
+
+-- table alias as hint key
+SELECT /*+ STATE_TTL('o'='3d', 'l'='1d') */ * FROM
+orders o LEFT JOIN lineitem l
+ON o.o_orderkey = l.l_orderkey;
+
+-- temporary view name as hint key
+CREATE TEMPORARY VIEW left_input AS SELECT ... FROM orders WHERE ...;
+CREATE TEMPORARY VIEW right_input AS SELECT ... FROM lineitem WHERE ...;
+SELECT /*+ STATE_TTL('left_input'= '360000s', 'right_input' = '15h') */ * 
+FROM left_input JOIN right_input
+ON left_input.join_key = right_input.join_key;
+
+-- cascade joins
+SELECT /*+ STATE_TTL('o' = '3d', 'l' = '1d', 'c' = '10d') */ *
+FROM orders o LEFT OUTER JOIN lineitem l
+ON o.o_orderkey = l.l_orderkey
+LEFT OUTER JOIN customers c
+ON o.o_custkey = c.c_custkey;
+```
+
+##### Group Aggregation Examples
+
+```sql
+-- table name as hint key
+SELECT /*+ STATE_TTL('orders' = '1d') */ o_orderkey, SUM(o_totalprice) AS 
revenue
+FROM orders
+GROUP BY o_orderkey;
+
+-- table alias as hint key
+SELECT /*+ STATE_TTL('o' = '1d') */ o_orderkey, SUM(o_totalprice) AS revenue
+FROM orders AS o
+GROUP BY o_orderkey;
+
+-- query block alias as hint key
+SELECT /*+ STATE_TTL('tmp' = '1d') */ o_orderkey, SUM(o_totalprice) AS revenue
+FROM (SELECT o_orderkey, o_totalprice
+      FROM orders
+      WHERE o_shippriority = 0) tmp
+GROUP BY o_orderkey;
+```
+
+{{< hint info >}}
+Note:
+
+- Users can choose either table/view name or table alias as the hint key. 
However, once the alias is specified, the `STATE_TTL` must be hinted on the 
alias.
+- For cascade joins, the specified state TTLs will be interpreted as the left 
and right state TTL for the first join operator and 
+  the right state TTL for the second join operator (from a bottom-up order). 
+  The left state TTL for the second join operator will be retrieved from the 
configuration `table.exec.state.ttl`. 
+  If users need to set a specific TTL value for the left state of the second 
join operator, the query needs to be split into query blocks like 
+  ```sql
+  CREATE TEMPORARY VIEW V AS 
+  SELECT /*+ STATE_TTL('A' = '${ttl_A}', 'B' = '${ttl_B}')*/ * FROM A JOIN B 
ON...;
+  SELECT /*+ STATE_TTL('V' = '${ttl_V}', 'C' = '${ttl_C}')*/ * FROM V JOIN C 
ON ...;
+  ```
+- STATE_TTL hint only applies on the underlying query block.
+  {{< /hint >}}
+
+#### Different ways to configure state TTL for SQL pipeline

Review Comment:
   I feel placing this content here is somewhat awkward. How about moving it to 
https://nightlies.apache.org/flink/flink-docs-master/docs/dev/table/concepts/overview/#configure-operator-level-state-ttl
 , and then adding a link at the beginning where `State TTL Hints` is 
introduced?



##########
docs/content.zh/docs/dev/table/sql/queries/hints.md:
##########
@@ -549,6 +549,122 @@ SELECT /*+ BROADCAST(t1) SHUFFLE_HASH(t1) */ * FROM t1 
FULL OUTER JOIN t2 ON t1.
 -- 由于指定的两种联接提示都不支持不等值的联接条件。所以,只能使用支持非等值联接条件的 nested loop join。
 SELECT /*+ BROADCAST(t1) SHUFFLE_HASH(t1) */ * FROM t1 FULL OUTER JOIN t2 ON 
t1.id > t2.id;
 ```
+### 状态生命周期提示
+
+{{< label Streaming >}}
+
+对于有状态计算的[流连接]({{< ref "docs/dev/table/sql/queries/joins" 
>}}#regular-joins)和[分组聚合]({{< ref "docs/dev/table/sql/queries/group-agg" 
>}})操作,用户可以通过 `STATE_TTL` 来指定算子粒度的[空闲状态维持时间]({{< ref 
"docs/dev/table/concepts/overview" 
>}}#idle-state-retention-time),该方式能够使得在上述状态算子中使用与 [table.exec.state.ttl]({{< 
ref "docs/dev/table/config" >}}#table-exec-state-ttl) 不同的值。
+
+##### 流连接示例
+
+```sql
+CREATE TABLE orders (
+  o_orderkey INT,
+  o_custkey INT,
+  o_status BOOLEAN,
+  o_totalprice DOUBLE
+) WITH (...);
+
+CREATE TABLE lineitem (
+  l_linenumber int,
+  l_orderkey int,
+  l_partkey int,
+  l_extendedprice double
+) WITH (...);
+
+CREATE TABLE customers (
+  c_custkey int,
+  c_address string
+) WITH (...);
+
+-- 表名作为 hint 键
+SELECT /*+ STATE_TTL('orders'='3d', 'lineitem'='1d') */ * FROM
+orders LEFT JOIN lineitem
+ON orders.o_orderkey = lineitem.l_orderkey;
+
+
+-- 别名作为 hint 键
+SELECT /*+ STATE_TTL('o'='3d', 'l'='1d') */ * FROM
+orders o LEFT JOIN lineitem l
+ON o.o_orderkey = l.l_orderkey;
+
+-- 临时视图作为 hint 键
+CREATE TEMPORARY VIEW left_input AS SELECT ... FROM orders WHERE ...;
+CREATE TEMPORARY VIEW right_input AS SELECT ... FROM lineitem WHERE ...;
+SELECT /*+ STATE_TTL('left_input'= '360000s', 'right_input' = '15h') */ * 
+FROM left_input JOIN right_input
+ON left_input.join_key = right_input.join_key;
+
+-- 级联 join
+SELECT /*+ STATE_TTL('o' = '3d', 'l' = '1d', 'c' = '10d') */ *
+FROM orders o LEFT OUTER JOIN lineitem l
+ON o.o_orderkey = l.l_orderkey
+LEFT OUTER JOIN customers c
+ON o.o_custkey = c.c_custkey;
+```
+
+##### 分组聚合示例
+
+```sql
+-- 表名作为 hint 键
+SELECT /*+ STATE_TTL('orders' = '1d') */ o_orderkey, SUM(o_totalprice) AS 
revenue
+FROM orders
+GROUP BY o_orderkey;
+
+-- 别名作为 hint 键
+SELECT /*+ STATE_TTL('o' = '1d') */ o_orderkey, SUM(o_totalprice) AS revenue
+FROM orders AS o
+GROUP BY o_orderkey;
+
+-- 查询块作为 hint 键
+SELECT /*+ STATE_TTL('tmp' = '1d') */ o_orderkey, SUM(o_totalprice) AS revenue
+FROM (SELECT o_orderkey, o_totalprice
+      FROM orders
+      WHERE o_shippriority = 0) tmp
+GROUP BY o_orderkey;
+```
+
+{{< hint info >}}
+注意:
+
+- 用户既可以选择表(或视图)名也可以选择别名作为提示键,但在指定别名时需要使用别名。
+- 
对于多流连接场景,直接指定每张表的生命周期只会在第一个连接算子的左右流和第二个连接算子的右流上生效(因为流上关联操作是二元的)。如果想为每个连接算子的左右流都指定不同生命周期,需要将查询拆成多个查询块,如下所示。
+  ```sql
+  CREATE TEMPORARY VIEW V AS 
+  SELECT /*+ STATE_TTL('A' = '${ttl_A}', 'B' = '${ttl_B}')*/ * FROM A JOIN B 
ON...;
+  SELECT /*+ STATE_TTL('V' = '${ttl_V}', 'C' = '${ttl_C}')*/ * FROM V JOIN C 
ON ...;
+  ```
+- STATE_TTL 提示仅作用在当前查询块上。
+  {{< /hint >}}

Review Comment:
   Nit: remove leading spaces



##########
docs/content.zh/docs/dev/table/sql/queries/hints.md:
##########
@@ -549,6 +549,122 @@ SELECT /*+ BROADCAST(t1) SHUFFLE_HASH(t1) */ * FROM t1 
FULL OUTER JOIN t2 ON t1.
 -- 由于指定的两种联接提示都不支持不等值的联接条件。所以,只能使用支持非等值联接条件的 nested loop join。
 SELECT /*+ BROADCAST(t1) SHUFFLE_HASH(t1) */ * FROM t1 FULL OUTER JOIN t2 ON 
t1.id > t2.id;
 ```
+### 状态生命周期提示
+
+{{< label Streaming >}}
+
+对于有状态计算的[流连接]({{< ref "docs/dev/table/sql/queries/joins" 
>}}#regular-joins)和[分组聚合]({{< ref "docs/dev/table/sql/queries/group-agg" 
>}})操作,用户可以通过 `STATE_TTL` 来指定算子粒度的[空闲状态维持时间]({{< ref 
"docs/dev/table/concepts/overview" 
>}}#idle-state-retention-time),该方式能够使得在上述状态算子中使用与 [table.exec.state.ttl]({{< 
ref "docs/dev/table/config" >}}#table-exec-state-ttl) 不同的值。

Review Comment:
   What about adding `与作业级别[table.exec.state.ttl](xxx)不同的值` in the last 
sentence?



##########
docs/content.zh/docs/dev/table/sql/queries/hints.md:
##########
@@ -549,6 +549,122 @@ SELECT /*+ BROADCAST(t1) SHUFFLE_HASH(t1) */ * FROM t1 
FULL OUTER JOIN t2 ON t1.
 -- 由于指定的两种联接提示都不支持不等值的联接条件。所以,只能使用支持非等值联接条件的 nested loop join。
 SELECT /*+ BROADCAST(t1) SHUFFLE_HASH(t1) */ * FROM t1 FULL OUTER JOIN t2 ON 
t1.id > t2.id;
 ```
+### 状态生命周期提示
+
+{{< label Streaming >}}
+
+对于有状态计算的[流连接]({{< ref "docs/dev/table/sql/queries/joins" 
>}}#regular-joins)和[分组聚合]({{< ref "docs/dev/table/sql/queries/group-agg" 
>}})操作,用户可以通过 `STATE_TTL` 来指定算子粒度的[空闲状态维持时间]({{< ref 
"docs/dev/table/concepts/overview" 
>}}#idle-state-retention-time),该方式能够使得在上述状态算子中使用与 [table.exec.state.ttl]({{< 
ref "docs/dev/table/config" >}}#table-exec-state-ttl) 不同的值。
+
+##### 流连接示例
+
+```sql
+CREATE TABLE orders (
+  o_orderkey INT,
+  o_custkey INT,
+  o_status BOOLEAN,
+  o_totalprice DOUBLE
+) WITH (...);
+
+CREATE TABLE lineitem (
+  l_linenumber int,
+  l_orderkey int,
+  l_partkey int,
+  l_extendedprice double
+) WITH (...);
+
+CREATE TABLE customers (
+  c_custkey int,
+  c_address string
+) WITH (...);
+
+-- 表名作为 hint 键
+SELECT /*+ STATE_TTL('orders'='3d', 'lineitem'='1d') */ * FROM
+orders LEFT JOIN lineitem
+ON orders.o_orderkey = lineitem.l_orderkey;
+
+
+-- 别名作为 hint 键
+SELECT /*+ STATE_TTL('o'='3d', 'l'='1d') */ * FROM
+orders o LEFT JOIN lineitem l
+ON o.o_orderkey = l.l_orderkey;
+
+-- 临时视图作为 hint 键
+CREATE TEMPORARY VIEW left_input AS SELECT ... FROM orders WHERE ...;
+CREATE TEMPORARY VIEW right_input AS SELECT ... FROM lineitem WHERE ...;
+SELECT /*+ STATE_TTL('left_input'= '360000s', 'right_input' = '15h') */ * 
+FROM left_input JOIN right_input
+ON left_input.join_key = right_input.join_key;
+
+-- 级联 join
+SELECT /*+ STATE_TTL('o' = '3d', 'l' = '1d', 'c' = '10d') */ *
+FROM orders o LEFT OUTER JOIN lineitem l
+ON o.o_orderkey = l.l_orderkey
+LEFT OUTER JOIN customers c
+ON o.o_custkey = c.c_custkey;
+```
+
+##### 分组聚合示例
+
+```sql
+-- 表名作为 hint 键
+SELECT /*+ STATE_TTL('orders' = '1d') */ o_orderkey, SUM(o_totalprice) AS 
revenue
+FROM orders
+GROUP BY o_orderkey;
+
+-- 别名作为 hint 键
+SELECT /*+ STATE_TTL('o' = '1d') */ o_orderkey, SUM(o_totalprice) AS revenue
+FROM orders AS o
+GROUP BY o_orderkey;
+
+-- 查询块作为 hint 键
+SELECT /*+ STATE_TTL('tmp' = '1d') */ o_orderkey, SUM(o_totalprice) AS revenue
+FROM (SELECT o_orderkey, o_totalprice
+      FROM orders
+      WHERE o_shippriority = 0) tmp
+GROUP BY o_orderkey;
+```
+
+{{< hint info >}}
+注意:
+
+- 用户既可以选择表(或视图)名也可以选择别名作为提示键,但在指定别名时需要使用别名。
+- 
对于多流连接场景,直接指定每张表的生命周期只会在第一个连接算子的左右流和第二个连接算子的右流上生效(因为流上关联操作是二元的)。如果想为每个连接算子的左右流都指定不同生命周期,需要将查询拆成多个查询块,如下所示。
+  ```sql
+  CREATE TEMPORARY VIEW V AS 
+  SELECT /*+ STATE_TTL('A' = '${ttl_A}', 'B' = '${ttl_B}')*/ * FROM A JOIN B 
ON...;

Review Comment:
   What about replacing these `${ttl_X}` with specific ttl example?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Reply via email to