[
https://issues.apache.org/jira/browse/FLINK-31706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17721776#comment-17721776
]
Lijie Wang commented on FLINK-31706:
------------------------------------
I think it's a good idea to use {{parallelism.default}} instead of
{{execution.batch.adaptive.auto-parallelism.default-source-parallelism}}.
Regarding the parallelism of Source in the adaptive batch scheduler, we also
have some other ideas/actions planned: dynamically inferring the Source parallelism
at runtime (according to the amount of data that Source actually needs to read
after Dynamic Partition Pruning). One possible way is that the source
coordinator can infer the parallelism based on the splits information actually
consumed.
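To make the idea concrete, here is a rough sketch of how such an inference could look. Everything in it is an assumption for illustration: the class name, the method, the {{targetBytesPerTask}} and {{maxParallelism}} parameters, and the "one subtask per N bytes" heuristic are hypothetical and not an existing Flink API.

```java
import java.util.List;

// Hypothetical sketch, not Flink code: infer a source parallelism from the
// sizes of the splits that remain after Dynamic Partition Pruning.
final class SplitBasedParallelismSketch {

    /**
     * Picks a parallelism so that each subtask reads roughly
     * targetBytesPerTask bytes, never uses more subtasks than there are
     * splits, and stays within [1, maxParallelism].
     */
    static int inferParallelism(
            List<Long> splitSizesInBytes, long targetBytesPerTask, int maxParallelism) {
        if (targetBytesPerTask <= 0 || maxParallelism <= 0) {
            throw new IllegalArgumentException("target size and max parallelism must be positive");
        }
        long totalBytes = 0;
        for (long size : splitSizesInBytes) {
            totalBytes += size;
        }
        // Ceiling division: number of subtasks needed for the data volume.
        long byVolume = (totalBytes + targetBytesPerTask - 1) / targetBytesPerTask;
        // A subtask without a split would be idle, so cap at the split count.
        long inferred = Math.min(byVolume, splitSizesInBytes.size());
        // Clamp to [1, maxParallelism]; an empty split list still yields 1.
        return (int) Math.max(1, Math.min(inferred, maxParallelism));
    }
}
```

For example, three splits of 100 bytes each with a target of 150 bytes per task would give a parallelism of 2 under this heuristic.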
At that point, if the parallelism of the Source is not specified by the user, the
source coordinator will be responsible for inferring the parallelism
automatically (if the Source supports it). If the Source does not support inferring
parallelism automatically, {{parallelism.default}} will be used as the
parallelism of the Source. (An initial thought :))
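The fallback order described above could be sketched roughly as follows. The class and method names are hypothetical and only illustrate the proposed precedence; an empty {{inferredByCoordinator}} stands for a Source that does not support automatic inference.

```java
import java.util.OptionalInt;

// Hypothetical sketch, not Flink code: precedence of the three possible
// origins of a source's parallelism under the proposal.
final class SourceParallelismFallbackSketch {

    static int resolve(
            OptionalInt userSpecified,
            OptionalInt inferredByCoordinator,
            int defaultParallelism) {
        // 1. An explicit user setting always wins.
        if (userSpecified.isPresent()) {
            return userSpecified.getAsInt();
        }
        // 2. Otherwise use the value inferred by the source coordinator,
        //    if the Source supports inference.
        if (inferredByCoordinator.isPresent()) {
            return inferredByCoordinator.getAsInt();
        }
        // 3. Otherwise fall back to the value of parallelism.default.
        return defaultParallelism;
    }
}
```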
> The default source parallelism should be the same as execution's default
> parallelism under adaptive batch scheduler
> -------------------------------------------------------------------------------------------------------------------
>
> Key: FLINK-31706
> URL: https://issues.apache.org/jira/browse/FLINK-31706
> Project: Flink
> Issue Type: Improvement
> Components: Runtime / Coordination
> Reporter: Yun Tang
> Priority: Major
> Labels: pull-request-available
> Fix For: 1.18.0
>
>
> Currently, sources need to set
> {{execution.batch.adaptive.auto-parallelism.default-source-parallelism}} in
> the adaptive batch scheduler mode; otherwise, the source parallelism is only
> 1 by default. A better solution might be to fall back to the default execution
> parallelism if the user has not configured it.