This is an automated email from the ASF dual-hosted git repository.
tangyun pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/flink.git
The following commit(s) were added to refs/heads/master by this push:
new 19f4239ab34 [FLINK-27327][docs] Add description about changing max
parallelism explicitly leads to state incompatibility
19f4239ab34 is described below
commit 19f4239ab34a232841484fd6a258afd336f942a1
Author: Hangxiang Yu <[email protected]>
AuthorDate: Thu Apr 28 15:06:43 2022 +0800
[FLINK-27327][docs] Add description about changing max parallelism
explicitly leads to state incompatibility
---
docs/content.zh/docs/dev/datastream/execution/parallel.md | 5 ++++-
docs/content/docs/dev/datastream/execution/parallel.md | 2 ++
docs/layouts/shortcodes/generated/pipeline_configuration.html | 2 +-
.../main/java/org/apache/flink/configuration/PipelineOptions.java | 3 ++-
4 files changed, 9 insertions(+), 3 deletions(-)
diff --git a/docs/content.zh/docs/dev/datastream/execution/parallel.md
b/docs/content.zh/docs/dev/datastream/execution/parallel.md
index 4999c47d858..216dea0281c 100644
--- a/docs/content.zh/docs/dev/datastream/execution/parallel.md
+++ b/docs/content.zh/docs/dev/datastream/execution/parallel.md
@@ -215,7 +215,10 @@ Python API 中尚不支持该特性。
默认的最大并行度等于将 `operatorParallelism + (operatorParallelism / 2)`
值四舍五入到大于等于该值的一个整型值,并且这个整型值是 `2` 的幂次方,注意默认最大并行度下限为 `128`,上限为 `32768`。
-<span class="label label-danger">注意</span> 为最大并行度设置一个非常大的值将会降低性能,因为一些 state
backends 需要维持内部的数据结构,而这些数据结构将会随着 key-groups 的数目而扩张(key-group 是状态重新分配的最小单元)。
+{{< hint warning >}}
+为最大并行度设置一个非常大的值将会降低性能,因为一些 state backends 需要维持内部的数据结构,而这些数据结构将会随着 key-groups
的数目而扩张(key-group 是状态重新分配的最小单元)。
+从之前的作业恢复时,改变该作业的最大并发度将会导致状态不兼容。
+{{< /hint >}}
{{< top >}}
diff --git a/docs/content/docs/dev/datastream/execution/parallel.md
b/docs/content/docs/dev/datastream/execution/parallel.md
index 8125ec6b95f..0b0b7e67ca8 100644
--- a/docs/content/docs/dev/datastream/execution/parallel.md
+++ b/docs/content/docs/dev/datastream/execution/parallel.md
@@ -239,6 +239,8 @@ Setting the maximum parallelism to a very large
value can be detrimental to performance because some state backends have to
keep internal data
structures that scale with the number of key-groups (which are the internal
implementation mechanism for
rescalable state).
+
+Changing the maximum parallelism explicitly when recovery from original job
will lead to state incompatibility.
{{< /hint >}}
{{< top >}}
diff --git a/docs/layouts/shortcodes/generated/pipeline_configuration.html
b/docs/layouts/shortcodes/generated/pipeline_configuration.html
index 28e141ae334..6a6719f95ef 100644
--- a/docs/layouts/shortcodes/generated/pipeline_configuration.html
+++ b/docs/layouts/shortcodes/generated/pipeline_configuration.html
@@ -90,7 +90,7 @@
<td><h5>pipeline.max-parallelism</h5></td>
<td style="word-wrap: break-word;">-1</td>
<td>Integer</td>
- <td>The program-wide maximum parallelism used for operators which
haven't specified a maximum parallelism. The maximum parallelism specifies the
upper limit for dynamic scaling and the number of key groups used for
partitioned state.</td>
+ <td>The program-wide maximum parallelism used for operators which
haven't specified a maximum parallelism. The maximum parallelism specifies the
upper limit for dynamic scaling and the number of key groups used for
partitioned state. Changing the value explicitly when recovery from original
job will lead to state incompatibility.</td>
</tr>
<tr>
<td><h5>pipeline.name</h5></td>
diff --git
a/flink-core/src/main/java/org/apache/flink/configuration/PipelineOptions.java
b/flink-core/src/main/java/org/apache/flink/configuration/PipelineOptions.java
index 34049474ba8..f3a2c313600 100644
---
a/flink-core/src/main/java/org/apache/flink/configuration/PipelineOptions.java
+++
b/flink-core/src/main/java/org/apache/flink/configuration/PipelineOptions.java
@@ -181,7 +181,8 @@ public class PipelineOptions {
.withDescription(
"The program-wide maximum parallelism used for
operators which haven't specified a"
+ " maximum parallelism. The maximum
parallelism specifies the upper limit for dynamic scaling and"
- + " the number of key groups used for
partitioned state.");
+ + " the number of key groups used for
partitioned state."
+ + " Changing the value explicitly when
recovery from original job will lead to state incompatibility.");
public static final ConfigOption<Boolean> OBJECT_REUSE =
key("pipeline.object-reuse")