zhuzhurk commented on code in PR #21801:
URL: https://github.com/apache/flink/pull/21801#discussion_r1099898172
##########
docs/content/docs/deployment/elastic_scaling.md:
##########
@@ -161,36 +161,37 @@ The Adaptive Batch Scheduler can automatically decide
parallelisms of operators
To automatically decide parallelisms for operators with Adaptive Batch
Scheduler, you need to:
- Configure to use Adaptive Batch Scheduler.
-- Set the parallelism of operators to `-1`.
+- Avoid setting the parallelism of operators.
#### Configure to use Adaptive Batch Scheduler
-To use Adaptive Batch Scheduler, you need to:
-- Set `jobmanager.scheduler: AdaptiveBatch`.
-- Leave the [`execution.batch-shuffle-mode`]({{< ref "docs/deployment/config"
>}}#execution-batch-shuffle-mode) unset or explicitly set it to
`ALL-EXCHANGES-BLOCKING` (default value) due to ["ALL-EXCHANGES-BLOCKING jobs
only"](#limitations-2).
-
-In addition, there are several related configuration options that may need
adjustment when using Adaptive Batch Scheduler:
-- [`jobmanager.adaptive-batch-scheduler.min-parallelism`]({{< ref
"docs/deployment/config"
>}}#jobmanager-adaptive-batch-scheduler-min-parallelism): The lower bound of
allowed parallelism to set adaptively.
-- [`jobmanager.adaptive-batch-scheduler.max-parallelism`]({{< ref
"docs/deployment/config"
>}}#jobmanager-adaptive-batch-scheduler-max-parallelism): The upper bound of
allowed parallelism to set adaptively.
-- [`jobmanager.adaptive-batch-scheduler.avg-data-volume-per-task`]({{< ref
"docs/deployment/config"
>}}#jobmanager-adaptive-batch-scheduler-avg-data-volume-per-task): The average
size of data volume to expect each task instance to process. Note that when
data skew occurs, or the decided parallelism reaches the max parallelism (due
to too much data), the data actually processed by some tasks may far exceed
this value.
-- [`jobmanager.adaptive-batch-scheduler.default-source-parallelism`]({{< ref
"docs/deployment/config"
>}}#jobmanager-adaptive-batch-scheduler-default-source-parallelism): The
default parallelism of data source.
-
-#### Set the parallelism of operators to `-1`
-Adaptive Batch Scheduler will only decide parallelism for operators whose
parallelism is not specified by users (parallelism is `-1`). So if you want the
parallelism of operators to be decided automatically, you should configure as
follows:
-- Set `parallelism.default: -1`
-- Set `table.exec.resource.default-parallelism: -1` in SQL jobs.
-- Don't call `setParallelism()` for operators in DataStream/DataSet jobs.
-- Don't call `setParallelism()` on
`StreamExecutionEnvironment/ExecutionEnvironment` in DataStream/DataSet jobs.
+At present, the Adaptive Batch Scheduler is the default scheduler for Flink
batch jobs. No additional configuration is required unless other schedulers are
explicitly configured, e.g. `jobmanager.scheduler: default`. Note that you need
to
+leave the [`execution.batch-shuffle-mode`]({{< ref "docs/deployment/config"
>}}#execution-batch-shuffle-mode) unset or explicitly set it to
`ALL-EXCHANGES-BLOCKING` (default value) due to ["ALL-EXCHANGES-BLOCKING jobs
only"](#limitations-2).
+
+#### Configure to automatically decide parallelisms for operators
+Adaptive Batch Scheduler enables automatic parallelism derivation by default,
you can configure [`execution.batch.adaptive.auto-parallelism.enabled`]({{< ref
"docs/deployment/config" >}}#execution-batch-adaptive-auto-parallelism-enabled)
to switch this feature.
+In addition, there are several related configuration options that may need
adjustment when using Adaptive Batch Scheduler automatically decide
parallelisms for operators:
Review Comment:
automatically -> to automatically (i.e. "when using Adaptive Batch Scheduler to automatically decide parallelisms for operators")
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.