infoverload commented on a change in pull request #17260:
URL: https://github.com/apache/flink/pull/17260#discussion_r708189420
##########
File path: docs/content/docs/dev/table/concepts/overview.md
##########
@@ -32,6 +32,82 @@ This means that Table API and SQL queries have the same
semantics regardless whe
The following pages explain concepts, practical limitations, and
stream-specific configuration parameters of Flink's relational APIs on
streaming data.
+State Management
+----------------
+
+Table programs that run in streaming mode leverage all capabilities of Flink
as a stateful stream
+processor.
+
+In particular, a table program can be configured with a [state backend]({{<
ref "docs/ops/state/state_backends" >}})
+and various [checkpointing options]({{< ref
"docs/dev/datastream/fault-tolerance/checkpointing" >}})
+for handling different requirements regarding state size and fault tolerance.
It is possible to take
+a savepoint of a running Table API & SQL pipeline and to restore the
application's state at a later
+point in time.
+
+### State Usage
+
+Due to the declarative nature of Table API & SQL programs, it is not always
obvious where and how much
+state is used within a pipeline. The planner decides whether state is
necessary to compute a correct
+result. A pipeline is optimized to claim as little state as possible given the
current set of optimizer
+rules.
+
+{{< hint info >}}
+Conceptually, source tables are never kept entirely in state. An implementer
deals with logical tables
+(i.e. [dynamic tables]({{< ref "docs/dev/table/concepts/dynamic_tables" >}})).
Their state requirements
+depend on the operations used.
+{{< /hint >}}
+
+Queries such as `SELECT ... FROM ... WHERE` that consist only of field
projections or filters are usually
+stateless pipelines. However, operations such as joins, aggregations, or
deduplications require keeping
+intermediate results in fault-tolerant storage, for which Flink's state
abstractions are used.
+
+{{< hint info >}}
+Please refer to the individual operator documentation for more details about
how much state is required
+and how to limit a potentially ever-growing state size.
+{{< /hint >}}
+
+For example, a regular SQL join of two tables requires the operator to keep
both input tables entirely
+in state. For correct SQL semantics, the runtime needs to assume that a
match could occur at any
+point in time from either side. Flink provides [optimized window and interval
joins]({{< ref "docs/dev/table/sql/queries/joins" >}})
+that aim to keep the state size small by exploiting the concept of
[watermarks]({{< ref "docs/dev/table/concepts/time_attributes" >}}).
+
+### Stateful Upgrades and Evolution
+
+Table programs that are executed in streaming mode are intended as *standing
queries* that statically
+define an end-to-end pipeline.
Review comment:
```suggestion
Table programs that are executed in streaming mode are intended as *standing
queries* (defined once and then executed continuously) that statically define
an end-to-end pipeline.
```
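
As a side note on the join paragraph quoted in the diff above, the difference between a regular join and a watermark-bounded join can be sketched in a few lines of plain Python. This is not Flink code and does not reflect how Flink's operators are implemented. The class names `RegularJoin` and `IntervalJoin`, the `on_watermark` hook, and the `bound` parameter are all hypothetical and exist only for illustration.

```python
from collections import defaultdict


class RegularJoin:
    """Toy inner equi-join: every row from both inputs stays in state."""

    def __init__(self):
        self.left = defaultdict(list)   # key -> rows seen on the left input
        self.right = defaultdict(list)  # key -> rows seen on the right input

    def on_left(self, key, row):
        self.left[key].append(row)
        return [(row, r) for r in self.right.get(key, [])]

    def on_right(self, key, row):
        self.right[key].append(row)
        return [(l, row) for l in self.left.get(key, [])]

    def state_size(self):
        return sum(len(v) for side in (self.left, self.right) for v in side.values())


class IntervalJoin:
    """Toy interval join: rows match if their timestamps differ by at most `bound`."""

    def __init__(self, bound):
        self.bound = bound
        self.left = defaultdict(list)   # key -> [(ts, row)]
        self.right = defaultdict(list)  # key -> [(ts, row)]

    def on_left(self, key, ts, row):
        self.left[key].append((ts, row))
        return [(row, r) for rts, r in self.right.get(key, [])
                if abs(ts - rts) <= self.bound]

    def on_right(self, key, ts, row):
        self.right[key].append((ts, row))
        return [(l, row) for lts, l in self.left.get(key, [])
                if abs(ts - lts) <= self.bound]

    def on_watermark(self, wm):
        # Assumption: a watermark promises that no later row has ts < wm,
        # so anything older than wm - bound can never match again.
        for side in (self.left, self.right):
            for key in list(side):
                kept = [(ts, row) for ts, row in side[key] if ts >= wm - self.bound]
                if kept:
                    side[key] = kept
                else:
                    del side[key]

    def state_size(self):
        return sum(len(v) for side in (self.left, self.right) for v in side.values())
```

Feeding 100 left-side rows with timestamps 0 to 99 into both toys, and advancing the watermark after each row, leaves 100 rows in the regular join's state. With `bound=5` the interval join retains only the last 6, which is the effect the paragraph attributes to exploiting watermarks.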