infoverload commented on a change in pull request #17260:
URL: https://github.com/apache/flink/pull/17260#discussion_r707412562



##########
File path: docs/content/docs/dev/table/concepts/overview.md
##########
@@ -32,6 +32,79 @@ This means that Table API and SQL queries have the same semantics regardless whe
 
 The following pages explain concepts, practical limitations, and stream-specific configuration parameters of Flink's relational APIs on streaming data.
 
+State Management
+----------------
+
+Table programs that run in streaming mode leverage all capabilities of Flink as a stateful stream
+processor.
+
+In particular, a table program can be configured with a [state backend]({{< ref "docs/ops/state/state_backends" >}})
+and various [checkpointing options]({{< ref "docs/dev/datastream/fault-tolerance/checkpointing" >}})
+for handling large amounts of state and fault tolerance. It is possible to take a savepoint of a running
+Table API & SQL pipeline and to restore the application's state at later point in time.
+
+### State Usage
+
+Due to the declarative nature of Table API & SQL program, it is not always obvious where and how much
+state is used within a table pipeline. The planner decides about when state is necessary to compute a correct
+result. A pipeline is optimized to claim as little state as possible given the current set of optimizer
+rules.
+
+{{< hint info >}}
+Source tables are never kept entirely in state. This depends on the used operations.
+{{< /hint >}}
+
+Simple `SELECT ... FROM ... WHERE` queries that only consist of field projections or filters are usually
+stateless pipelines. However, operations such as joins, aggregations, or deduplications require to keep
+intermediate results in a fault tolerant storage for which Flink's state abstractions are used.

Review comment:
       ```suggestion
   Queries such as `SELECT ... FROM ... WHERE` that only consist of field projections or filters are usually
   stateless pipelines. However, operations such as joins, aggregations, or deduplications require keeping
   intermediate results in a fault-tolerant storage for which Flink's state abstractions are used.
   ```
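
To make the stateless/stateful distinction in this paragraph concrete, here is a minimal sketch in plain Java. It is not Flink API and not how the planner is implemented; `StateSketch`, `filterPositive`, and `deduplicate` are hypothetical names chosen only for illustration. A filter can judge each row on its own, while deduplication has to remember what it has already seen, and that remembered set is the kind of intermediate result that would need fault-tolerant storage.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class StateSketch {

    // Stateless: each row is evaluated in isolation; nothing survives between rows.
    static List<Integer> filterPositive(List<Integer> rows) {
        List<Integer> out = new ArrayList<>();
        for (int row : rows) {
            if (row > 0) {
                out.add(row);
            }
        }
        return out;
    }

    // Stateful: the 'seen' set must persist across rows (and across failures),
    // otherwise duplicates could be emitted again after a restart.
    static List<Integer> deduplicate(List<Integer> rows, Set<Integer> seen) {
        List<Integer> out = new ArrayList<>();
        for (int row : rows) {
            // Set.add returns true only the first time a value is inserted.
            if (seen.add(row)) {
                out.add(row);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(filterPositive(List.of(-1, 2, 3)));

        Set<Integer> seen = new HashSet<>();
        System.out.println(deduplicate(List.of(1, 1, 2), seen));
        // A later batch still depends on what was seen earlier.
        System.out.println(deduplicate(List.of(2, 3), seen));
    }
}
```

Note that the second `deduplicate` call depends on the contents of `seen` left behind by the first one; losing that set would change the result, which is why such operations cannot be stateless.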




