[ 
https://issues.apache.org/jira/browse/FLINK-33768?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xingbe closed FLINK-33768.
--------------------------
    Release Note: 
In Flink 1.19, we have supported dynamic source parallelism inference for batch 
jobs, which allows source connectors to dynamically infer the parallelism based 
on the actual amount of data to consume. This feature is a significant 
improvement over previous versions, which only assigned a fixed default 
parallelism to source vertices. 

Source connectors need to implement the inference interface to enable dynamic 
parallelism inference. Currently, the FileSource connector has already been 
developed with this functionality in place.

Additionally, the configuration 
`execution.batch.adaptive.auto-parallelism.default-source-parallelism` will be 
used as the upper bound of source parallelism inference. And now it will not 
default to 1. Instead, if it is not set, the upper bound of allowed parallelism 
set via `execution.batch.adaptive.auto-parallelism.max-parallelism` will be 
used. If that configuration is also not set, the default parallelism set via 
`parallelism.default` or StreamExecutionEnvironment#setParallelism() will be 
used instead.
      Resolution: Done

> FLIP-379: Support dynamic source parallelism inference for batch jobs
> ---------------------------------------------------------------------
>
>                 Key: FLINK-33768
>                 URL: https://issues.apache.org/jira/browse/FLINK-33768
>             Project: Flink
>          Issue Type: Improvement
>          Components: Runtime / Coordination
>    Affects Versions: 1.19.0
>            Reporter: xingbe
>            Assignee: xingbe
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 1.19.0
>
>
> Currently, for JobVertices without parallelism configured, the 
> AdaptiveBatchScheduler dynamically infers the vertex parallelism based on the 
> volume of input data. Specifically, for Source vertices, it uses the value of 
> `{*}execution.batch.adaptive.auto-parallelism.default-source-parallelism{*}` 
> as the fixed parallelism. If this is not set by the user, the default value 
> of {{1}}  is used as the source parallelism, which is actually a temporary 
> implementation solution.
> We aim to support dynamic source parallelism inference for batch jobs. More 
> details see 
> [FLIP-379|https://cwiki.apache.org/confluence/display/FLINK/FLIP-379%3A+Dynamic+source+parallelism+inference+for+batch+jobs].



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to