[jira] [Updated] (FLINK-36015) Align rescale parameters
[ https://issues.apache.org/jira/browse/FLINK-36015?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zdenek Tison updated FLINK-36015:

Release Note:
FLIP-472 aligns the timeout logic in AdaptiveScheduler states. To make the alignment clearer to users, the configuration has also been aligned:
- Parameter `jobmanager.adaptive-scheduler.resource-wait-timeout` was renamed to `jobmanager.adaptive-scheduler.submission.resource-wait-timeout`
- Parameter `jobmanager.adaptive-scheduler.resource-stabilization-timeout` was renamed to `jobmanager.adaptive-scheduler.submission.resource-stabilization-timeout`
- Parameter `jobmanager.adaptive-scheduler.scaling-interval.min` was renamed to `jobmanager.adaptive-scheduler.executing.cooldown-after-rescaling`
- Parameter `jobmanager.adaptive-scheduler.scaling-interval.max` was replaced by `jobmanager.adaptive-scheduler.executing.resource-stabilization-timeout` with a default value of 60s.
- Parameter `jobmanager.adaptive-scheduler.min-parallelism-increase` was removed without a replacement.
The following parameters introduced in Flink 2.0 have also been renamed:
- Parameter `jobmanager.adaptive-scheduler.max-delay-for-scale-trigger` was renamed to `jobmanager.adaptive-scheduler.rescale-trigger.max-delays`
- Parameter `jobmanager.adaptive-scheduler.scale-on-failed-checkpoints-count` was renamed to `jobmanager.adaptive-scheduler.rescale-trigger.max-checkpoint-failures`
See Flink's configuration documentation for further details.

was:
FLIP-472 aligns the timeout logic in AdaptiveScheduler states. To make the alignment clearer to users, the configuration has also been aligned:
- Parameter `jobmanager.adaptive-scheduler.resource-wait-timeout` was renamed to `jobmanager.adaptive-scheduler.submission.resource-wait-timeout`
- Parameter `jobmanager.adaptive-scheduler.resource-stabilization-timeout` was renamed to `jobmanager.adaptive-scheduler.submission.resource-stabilization-timeout`
- Parameter `jobmanager.adaptive-scheduler.scaling-interval.min` was renamed to `jobmanager.adaptive-scheduler.executing.cooldown-after-rescaling`
- Parameter `jobmanager.adaptive-scheduler.scaling-interval.max` was replaced by `jobmanager.adaptive-scheduler.executing.resource-stabilization-timeout` with a default value of 60s.
- Parameter `jobmanager.adaptive-scheduler.min-parallelism-increase` was removed without a replacement.
The following parameters introduced in Flink 2.0 have also been renamed:
- Parameter `jobmanager.adaptive-scheduler.max-delay-for-scale-trigger` was renamed to `jobmanager.adaptive-scheduler.rescale-trigger.max-checkpoint-failures`
- Parameter `jobmanager.adaptive-scheduler.scale-on-failed-checkpoints-count` was renamed to `jobmanager.adaptive-scheduler.rescale-trigger.max-checkpoint-failures`
See Flink's configuration documentation for further details.
> Align rescale parameters
>
> Key: FLINK-36015
> URL: https://issues.apache.org/jira/browse/FLINK-36015
> Project: Flink
> Issue Type: Sub-task
> Components: Runtime / Configuration
> Reporter: Zdenek Tison
> Priority: Major
> Labels: pull-request-available
> Fix For: 2.0-preview
>
> * Parameter [_jobmanager.adaptive-scheduler.resource-wait-timeout_|https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/config/#jobmanager-adaptive-scheduler-resource-wait-timeout] will be renamed to the jobmanager.adaptive-scheduler.submission.resource-wait-timeout
> * Parameter [_jobmanager.adaptive-scheduler.resource-stabilization-timeout_|https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/config/#jobmanager-adaptive-scheduler-resource-wait-timeout] will be renamed to the jobmanager.adaptive-scheduler.submission.resource-stabilization-timeout
> * Parameter [_jobmanager.adaptive-scheduler.scaling-interval.min_|https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/config/#jobmanager-adaptive-scheduler-scaling-interval-min] will be renamed to the jobmanager.adaptive-scheduler.executing.cooldown-after-rescaling
> * Parameter [_jobmanager.adaptive-scheduler.scaling-interval.max_|https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/config/#jobmanager-adaptive-scheduler-scaling-interval-max] will be renamed to the jobmanager.adaptive-scheduler.executing.resource-stabilization-timeout with default value 60s.
> * Parameter [jobmanager.adaptive-scheduler.min-parallelism-increase|https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/config/#jobmanager-adaptive-scheduler-min-parallelism-increase] will be removed without a direct replacement. Still, it will be superseded by combining the parameters jobmanager.adaptive-schedule
[jira] [Updated] (FLINK-36015) Align rescale parameters
[ https://issues.apache.org/jira/browse/FLINK-36015?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zdenek Tison updated FLINK-36015: - Release Note: FLIP-472 aligns timeout logic in AdaptiveScheduler states. To make alignment more clear to users the configuration has also been alignment: - Parameter `jobmanager.adaptive-scheduler.resource-wait-timeout` was renamed to the `jobmanager.adaptive-scheduler.submission.resource-wait-timeout` - Parameter `jobmanager.adaptive-scheduler.resource-stabilization-timeout` was renamed to the `jobmanager.adaptive-scheduler.submission.resource-stabilization-timeout` - Parameter `jobmanager.adaptive-scheduler.scaling-interval.min` was renamed to the `jobmanager.adaptive-scheduler.executing.cooldown-after-rescaling` - Parameter `jobmanager.adaptive-scheduler.scaling-interval.max` was replaced by the `jobmanager.adaptive-scheduler.executing.resource-stabilization-timeout` with a default value of 60s. - Parameter `jobmanager.adaptive-scheduler.min-parallelism-increase` was removed without a replacement. The following parameters introduced in Flink 2.0 have been also renamed: - Parameter `jobmanager.adaptive-scheduler.max-delay-for-scale-trigger` was renamed to the `jobmanager.adaptive-scheduler.rescale-trigger.max-checkpoint-failures` - Parameter `jobmanager.adaptive-scheduler.scale-on-failed-checkpoints-count` was renamed to the `jobmanager.adaptive-scheduler.rescale-trigger.max-checkpoint-failures` See Flink's configuration documentation for further details. 
was:
AdaptiveScheduler configuration has been aligned for the different AdaptiveScheduler stages:
- Parameter `jobmanager.adaptive-scheduler.resource-wait-timeout` was renamed to `jobmanager.adaptive-scheduler.submission.resource-wait-timeout`
- Parameter `jobmanager.adaptive-scheduler.resource-stabilization-timeout` was renamed to `jobmanager.adaptive-scheduler.submission.resource-stabilization-timeout`
- Parameter `jobmanager.adaptive-scheduler.scaling-interval.min` was renamed to `jobmanager.adaptive-scheduler.executing.cooldown-after-rescaling`
- Parameter `jobmanager.adaptive-scheduler.scaling-interval.max` was replaced by `jobmanager.adaptive-scheduler.executing.resource-stabilization-timeout` with a default value of 60s.
- Parameter `jobmanager.adaptive-scheduler.min-parallelism-increase` was removed without a replacement.
The following parameters introduced in Flink 2.0 have also been renamed:
- Parameter `jobmanager.adaptive-scheduler.max-delay-for-scale-trigger` was renamed to `jobmanager.adaptive-scheduler.rescale-trigger.max-checkpoint-failures`
- Parameter `jobmanager.adaptive-scheduler.scale-on-failed-checkpoints-count` was renamed to `jobmanager.adaptive-scheduler.rescale-trigger.max-checkpoint-failures`

> Align rescale parameters
>
> Key: FLINK-36015
> URL: https://issues.apache.org/jira/browse/FLINK-36015
> Project: Flink
> Issue Type: Sub-task
> Components: Runtime / Configuration
> Reporter: Zdenek Tison
> Priority: Major
> Labels: pull-request-available
> Fix For: 2.0-preview
>
> * Parameter [_jobmanager.adaptive-scheduler.resource-wait-timeout_|https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/config/#jobmanager-adaptive-scheduler-resource-wait-timeout] will be renamed to the jobmanager.adaptive-scheduler.submission.resource-wait-timeout
> * Parameter >
[_jobmanager.adaptive-scheduler.resource-stabilization-timeout_|https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/config/#jobmanager-adaptive-scheduler-resource-wait-timeout] > will be renamed to the > jobmanager.adaptive-scheduler.submission.resource-stabilization-timeout > * Parameter > {_}j{_}[_obmanager.adaptive-scheduler.scaling-interval.min_|https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/config/#jobmanager-adaptive-scheduler-scaling-interval-min] > will be renamed to the > jobmanager.adaptive-scheduler.executing.cooldown-after-rescaling > * Parameter > [_jobmanager.adaptive-scheduler.scaling-interval.max_|https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/config/#jobmanager-adaptive-scheduler-scaling-interval-max] > will be renamed to the > {_}jobmanager.adaptive-scheduler{_}{_}.{_}executing.resource-stabilization-timeout > with default value 60s. > * Parameter > [jobmanager.adaptive-scheduler.min-parallelism-increase|https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/config/#jobmanager-adaptive-scheduler-min-parallelism-increase] > will be removed without a direct replacement. Still, it will be superseded > by combining the parameters > jobmanager.adaptive-scheduler.executing.cooldown-after-rescaling and > {_}jobmanager.adaptive-scheduler{_}{_}.{_}executing.reso
[jira] [Updated] (FLINK-36015) Align rescale parameters
[ https://issues.apache.org/jira/browse/FLINK-36015?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zdenek Tison updated FLINK-36015:

Release Note:
AdaptiveScheduler configuration has been aligned for the different AdaptiveScheduler stages:
- Parameter `jobmanager.adaptive-scheduler.resource-wait-timeout` was renamed to `jobmanager.adaptive-scheduler.submission.resource-wait-timeout`
- Parameter `jobmanager.adaptive-scheduler.resource-stabilization-timeout` was renamed to `jobmanager.adaptive-scheduler.submission.resource-stabilization-timeout`
- Parameter `jobmanager.adaptive-scheduler.scaling-interval.min` was renamed to `jobmanager.adaptive-scheduler.executing.cooldown-after-rescaling`
- Parameter `jobmanager.adaptive-scheduler.scaling-interval.max` was replaced by `jobmanager.adaptive-scheduler.executing.resource-stabilization-timeout` with a default value of 60s.
- Parameter `jobmanager.adaptive-scheduler.min-parallelism-increase` was removed without a replacement.
The following parameters introduced in Flink 2.0 have been also renamed: - Parameter `jobmanager.adaptive-scheduler.max-delay-for-scale-trigger` was renamed to the `jobmanager.adaptive-scheduler.rescale-trigger.max-checkpoint-failures` - Parameter `jobmanager.adaptive-scheduler.scale-on-failed-checkpoints-count` was renamed to the `jobmanager.adaptive-scheduler.rescale-trigger.max-checkpoint-failures` was: - Parameter `jobmanager.adaptive-scheduler.resource-wait-timeout` was renamed to the `jobmanager.adaptive-scheduler.submission.resource-wait-timeout` - Parameter `jobmanager.adaptive-scheduler.resource-stabilization-timeout` was renamed to the `jobmanager.adaptive-scheduler.submission.resource-stabilization-timeout` - Parameter `jobmanager.adaptive-scheduler.scaling-interval.min` was renamed to the `jobmanager.adaptive-scheduler.executing.cooldown-after-rescaling` - Parameter `jobmanager.adaptive-scheduler.scaling-interval.max` was replaced by the `jobmanager.adaptive-scheduler.executing.resource-stabilization-timeout` with a default value of 60s. - Parameter `jobmanager.adaptive-scheduler.min-parallelism-increase` was removed without a replacement. 
The following parameters introduced in Flink 2.0 have been also renamed: - Parameter `jobmanager.adaptive-scheduler.max-delay-for-scale-trigger` was renamed to the `jobmanager.adaptive-scheduler.rescale-trigger.max-checkpoint-failures` - Parameter `jobmanager.adaptive-scheduler.scale-on-failed-checkpoints-count` was renamed to the `jobmanager.adaptive-scheduler.rescale-trigger.max-checkpoint-failures` > Align rescale parameters > > > Key: FLINK-36015 > URL: https://issues.apache.org/jira/browse/FLINK-36015 > Project: Flink > Issue Type: Sub-task > Components: Runtime / Configuration >Reporter: Zdenek Tison >Priority: Major > Labels: pull-request-available > Fix For: 2.0-preview > > > * Parameter > [_jobmanager.adaptive-scheduler.resource-wait-timeout_|https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/config/#jobmanager-adaptive-scheduler-resource-wait-timeout] > will be renamed to the > jobmanager.adaptive-scheduler.submission.resource-wait-timeout > * Parameter > [_jobmanager.adaptive-scheduler.resource-stabilization-timeout_|https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/config/#jobmanager-adaptive-scheduler-resource-wait-timeout] > will be renamed to the > jobmanager.adaptive-scheduler.submission.resource-stabilization-timeout > * Parameter > {_}j{_}[_obmanager.adaptive-scheduler.scaling-interval.min_|https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/config/#jobmanager-adaptive-scheduler-scaling-interval-min] > will be renamed to the > jobmanager.adaptive-scheduler.executing.cooldown-after-rescaling > * Parameter > [_jobmanager.adaptive-scheduler.scaling-interval.max_|https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/config/#jobmanager-adaptive-scheduler-scaling-interval-max] > will be renamed to the > {_}jobmanager.adaptive-scheduler{_}{_}.{_}executing.resource-stabilization-timeout > with default value 60s. 
> * Parameter > [jobmanager.adaptive-scheduler.min-parallelism-increase|https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/config/#jobmanager-adaptive-scheduler-min-parallelism-increase] > will be removed without a direct replacement. Still, it will be superseded > by combining the parameters > jobmanager.adaptive-scheduler.executing.cooldown-after-rescaling and > {_}jobmanager.adaptive-scheduler{_}{_}.{_}executing.resource-stabilization-timeout -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-36015) Align rescale parameters
[ https://issues.apache.org/jira/browse/FLINK-36015?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zdenek Tison updated FLINK-36015: - Fix Version/s: 2.0-preview Release Note: Parameter `jobmanager.adaptive-scheduler.resource-wait-timeout` was renamed to the `jobmanager.adaptive-scheduler.submission.resource-wait-timeout` - Parameter `jobmanager.adaptive-scheduler.resource-stabilization-timeout` was renamed to the `jobmanager.adaptive-scheduler.submission.resource-stabilization-timeout` - Parameter `jobmanager.adaptive-scheduler.scaling-interval.min` was renamed to the `jobmanager.adaptive-scheduler.executing.cooldown-after-rescaling` - Parameter `jobmanager.adaptive-scheduler.scaling-interval.max` was replaced by the `jobmanager.adaptive-scheduler.executing.resource-stabilization-timeout` with a default value of 60s. - Parameter `jobmanager.adaptive-scheduler.min-parallelism-increase` was removed without a replacement. The following parameters are new in 2.0: - Parameter `jobmanager.adaptive-scheduler.max-delay-for-scale-trigger` was renamed to the `jobmanager.adaptive-scheduler.rescale-trigger.max-checkpoint-failures` - Parameter `jobmanager.adaptive-scheduler.scale-on-failed-checkpoints-count` was renamed to the `jobmanager.adaptive-scheduler.rescale-trigger.max-checkpoint-failures` > Align rescale parameters > > > Key: FLINK-36015 > URL: https://issues.apache.org/jira/browse/FLINK-36015 > Project: Flink > Issue Type: Sub-task > Components: Runtime / Configuration >Reporter: Zdenek Tison >Priority: Major > Labels: pull-request-available > Fix For: 2.0-preview > > > * Parameter > [_jobmanager.adaptive-scheduler.resource-wait-timeout_|https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/config/#jobmanager-adaptive-scheduler-resource-wait-timeout] > will be renamed to the > jobmanager.adaptive-scheduler.submission.resource-wait-timeout > * Parameter > 
[_jobmanager.adaptive-scheduler.resource-stabilization-timeout_|https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/config/#jobmanager-adaptive-scheduler-resource-wait-timeout] > will be renamed to the > jobmanager.adaptive-scheduler.submission.resource-stabilization-timeout > * Parameter > {_}j{_}[_obmanager.adaptive-scheduler.scaling-interval.min_|https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/config/#jobmanager-adaptive-scheduler-scaling-interval-min] > will be renamed to the > jobmanager.adaptive-scheduler.executing.cooldown-after-rescaling > * Parameter > [_jobmanager.adaptive-scheduler.scaling-interval.max_|https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/config/#jobmanager-adaptive-scheduler-scaling-interval-max] > will be renamed to the > {_}jobmanager.adaptive-scheduler{_}{_}.{_}executing.resource-stabilization-timeout > with default value 60s. > * Parameter > [jobmanager.adaptive-scheduler.min-parallelism-increase|https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/config/#jobmanager-adaptive-scheduler-min-parallelism-increase] > will be removed without a direct replacement. Still, it will be superseded > by combining the parameters > jobmanager.adaptive-scheduler.executing.cooldown-after-rescaling and > {_}jobmanager.adaptive-scheduler{_}{_}.{_}executing.resource-stabilization-timeout -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-36015) Align rescale parameters
[ https://issues.apache.org/jira/browse/FLINK-36015?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zdenek Tison updated FLINK-36015: - Release Note: - Parameter `jobmanager.adaptive-scheduler.resource-wait-timeout` was renamed to the `jobmanager.adaptive-scheduler.submission.resource-wait-timeout` - Parameter `jobmanager.adaptive-scheduler.resource-stabilization-timeout` was renamed to the `jobmanager.adaptive-scheduler.submission.resource-stabilization-timeout` - Parameter `jobmanager.adaptive-scheduler.scaling-interval.min` was renamed to the `jobmanager.adaptive-scheduler.executing.cooldown-after-rescaling` - Parameter `jobmanager.adaptive-scheduler.scaling-interval.max` was replaced by the `jobmanager.adaptive-scheduler.executing.resource-stabilization-timeout` with a default value of 60s. - Parameter `jobmanager.adaptive-scheduler.min-parallelism-increase` was removed without a replacement. The following parameters are new in 2.0: - Parameter `jobmanager.adaptive-scheduler.max-delay-for-scale-trigger` was renamed to the `jobmanager.adaptive-scheduler.rescale-trigger.max-checkpoint-failures` - Parameter `jobmanager.adaptive-scheduler.scale-on-failed-checkpoints-count` was renamed to the `jobmanager.adaptive-scheduler.rescale-trigger.max-checkpoint-failures` was: Parameter `jobmanager.adaptive-scheduler.resource-wait-timeout` was renamed to the `jobmanager.adaptive-scheduler.submission.resource-wait-timeout` - Parameter `jobmanager.adaptive-scheduler.resource-stabilization-timeout` was renamed to the `jobmanager.adaptive-scheduler.submission.resource-stabilization-timeout` - Parameter `jobmanager.adaptive-scheduler.scaling-interval.min` was renamed to the `jobmanager.adaptive-scheduler.executing.cooldown-after-rescaling` - Parameter `jobmanager.adaptive-scheduler.scaling-interval.max` was replaced by the `jobmanager.adaptive-scheduler.executing.resource-stabilization-timeout` with a default value of 60s. 
- Parameter `jobmanager.adaptive-scheduler.min-parallelism-increase` was removed without a replacement. The following parameters are new in 2.0: - Parameter `jobmanager.adaptive-scheduler.max-delay-for-scale-trigger` was renamed to the `jobmanager.adaptive-scheduler.rescale-trigger.max-checkpoint-failures` - Parameter `jobmanager.adaptive-scheduler.scale-on-failed-checkpoints-count` was renamed to the `jobmanager.adaptive-scheduler.rescale-trigger.max-checkpoint-failures` > Align rescale parameters > > > Key: FLINK-36015 > URL: https://issues.apache.org/jira/browse/FLINK-36015 > Project: Flink > Issue Type: Sub-task > Components: Runtime / Configuration >Reporter: Zdenek Tison >Priority: Major > Labels: pull-request-available > Fix For: 2.0-preview > > > * Parameter > [_jobmanager.adaptive-scheduler.resource-wait-timeout_|https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/config/#jobmanager-adaptive-scheduler-resource-wait-timeout] > will be renamed to the > jobmanager.adaptive-scheduler.submission.resource-wait-timeout > * Parameter > [_jobmanager.adaptive-scheduler.resource-stabilization-timeout_|https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/config/#jobmanager-adaptive-scheduler-resource-wait-timeout] > will be renamed to the > jobmanager.adaptive-scheduler.submission.resource-stabilization-timeout > * Parameter > {_}j{_}[_obmanager.adaptive-scheduler.scaling-interval.min_|https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/config/#jobmanager-adaptive-scheduler-scaling-interval-min] > will be renamed to the > jobmanager.adaptive-scheduler.executing.cooldown-after-rescaling > * Parameter > [_jobmanager.adaptive-scheduler.scaling-interval.max_|https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/config/#jobmanager-adaptive-scheduler-scaling-interval-max] > will be renamed to the > {_}jobmanager.adaptive-scheduler{_}{_}.{_}executing.resource-stabilization-timeout > with default value 60s. 
> * Parameter > [jobmanager.adaptive-scheduler.min-parallelism-increase|https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/config/#jobmanager-adaptive-scheduler-min-parallelism-increase] > will be removed without a direct replacement. Still, it will be superseded > by combining the parameters > jobmanager.adaptive-scheduler.executing.cooldown-after-rescaling and > {_}jobmanager.adaptive-scheduler{_}{_}.{_}executing.resource-stabilization-timeout -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-36015) Align rescale parameters
[ https://issues.apache.org/jira/browse/FLINK-36015?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zdenek Tison updated FLINK-36015: - Release Note: - Parameter `jobmanager.adaptive-scheduler.resource-wait-timeout` was renamed to the `jobmanager.adaptive-scheduler.submission.resource-wait-timeout` - Parameter `jobmanager.adaptive-scheduler.resource-stabilization-timeout` was renamed to the `jobmanager.adaptive-scheduler.submission.resource-stabilization-timeout` - Parameter `jobmanager.adaptive-scheduler.scaling-interval.min` was renamed to the `jobmanager.adaptive-scheduler.executing.cooldown-after-rescaling` - Parameter `jobmanager.adaptive-scheduler.scaling-interval.max` was replaced by the `jobmanager.adaptive-scheduler.executing.resource-stabilization-timeout` with a default value of 60s. - Parameter `jobmanager.adaptive-scheduler.min-parallelism-increase` was removed without a replacement. The following parameters introduced in Flink 2.0 have been also renamed: - Parameter `jobmanager.adaptive-scheduler.max-delay-for-scale-trigger` was renamed to the `jobmanager.adaptive-scheduler.rescale-trigger.max-checkpoint-failures` - Parameter `jobmanager.adaptive-scheduler.scale-on-failed-checkpoints-count` was renamed to the `jobmanager.adaptive-scheduler.rescale-trigger.max-checkpoint-failures` was: - Parameter `jobmanager.adaptive-scheduler.resource-wait-timeout` was renamed to the `jobmanager.adaptive-scheduler.submission.resource-wait-timeout` - Parameter `jobmanager.adaptive-scheduler.resource-stabilization-timeout` was renamed to the `jobmanager.adaptive-scheduler.submission.resource-stabilization-timeout` - Parameter `jobmanager.adaptive-scheduler.scaling-interval.min` was renamed to the `jobmanager.adaptive-scheduler.executing.cooldown-after-rescaling` - Parameter `jobmanager.adaptive-scheduler.scaling-interval.max` was replaced by the `jobmanager.adaptive-scheduler.executing.resource-stabilization-timeout` with a default value of 60s. 
- Parameter `jobmanager.adaptive-scheduler.min-parallelism-increase` was removed without a replacement. The following parameters are new in 2.0: - Parameter `jobmanager.adaptive-scheduler.max-delay-for-scale-trigger` was renamed to the `jobmanager.adaptive-scheduler.rescale-trigger.max-checkpoint-failures` - Parameter `jobmanager.adaptive-scheduler.scale-on-failed-checkpoints-count` was renamed to the `jobmanager.adaptive-scheduler.rescale-trigger.max-checkpoint-failures` > Align rescale parameters > > > Key: FLINK-36015 > URL: https://issues.apache.org/jira/browse/FLINK-36015 > Project: Flink > Issue Type: Sub-task > Components: Runtime / Configuration >Reporter: Zdenek Tison >Priority: Major > Labels: pull-request-available > Fix For: 2.0-preview > > > * Parameter > [_jobmanager.adaptive-scheduler.resource-wait-timeout_|https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/config/#jobmanager-adaptive-scheduler-resource-wait-timeout] > will be renamed to the > jobmanager.adaptive-scheduler.submission.resource-wait-timeout > * Parameter > [_jobmanager.adaptive-scheduler.resource-stabilization-timeout_|https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/config/#jobmanager-adaptive-scheduler-resource-wait-timeout] > will be renamed to the > jobmanager.adaptive-scheduler.submission.resource-stabilization-timeout > * Parameter > {_}j{_}[_obmanager.adaptive-scheduler.scaling-interval.min_|https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/config/#jobmanager-adaptive-scheduler-scaling-interval-min] > will be renamed to the > jobmanager.adaptive-scheduler.executing.cooldown-after-rescaling > * Parameter > [_jobmanager.adaptive-scheduler.scaling-interval.max_|https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/config/#jobmanager-adaptive-scheduler-scaling-interval-max] > will be renamed to the > {_}jobmanager.adaptive-scheduler{_}{_}.{_}executing.resource-stabilization-timeout > with default value 60s. 
> * Parameter > [jobmanager.adaptive-scheduler.min-parallelism-increase|https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/config/#jobmanager-adaptive-scheduler-min-parallelism-increase] > will be removed without a direct replacement. Still, it will be superseded > by combining the parameters > jobmanager.adaptive-scheduler.executing.cooldown-after-rescaling and > {_}jobmanager.adaptive-scheduler{_}{_}.{_}executing.resource-stabilization-timeout -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (FLINK-36279) AdaptiveScheduler#hasDesiredResources doesn't rely on all available slots which causes problems in Executing state
[ https://issues.apache.org/jira/browse/FLINK-36279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17882589#comment-17882589 ]

Zdenek Tison commented on FLINK-36279:

[~mapohl] Thank you very much for taking care of this bug.

> AdaptiveScheduler#hasDesiredResources doesn't rely on all available slots which causes problems in Executing state
>
> Key: FLINK-36279
> URL: https://issues.apache.org/jira/browse/FLINK-36279
> Project: Flink
> Issue Type: Bug
> Components: Runtime / Coordination
> Affects Versions: 2.0-preview
> Reporter: Matthias Pohl
> Assignee: Matthias Pohl
> Priority: Blocker
> Labels: pull-request-available
> Fix For: 2.0-preview
> Attachments: FLINK-36279-FLINK-36014-pr.success.log, FLINK-36279.20240914.6.success.log, FLINK-36279.fixed.success.log
>
> FLINK-36014 aligned the triggering of the execution graph creation in {{WaitingForResources}} and rescaling in {{Executing}} state. Before that change, only {{WaitingForResources}} relied on this method. Relying on free slots was good enough because in {{WaitingForResources}} state, there are no slots allocated, yet.
> Using this method for {{Executing}} state now as well changes this premise because there are slots allocated while checking the slot availability that would become available after the restart. Hence, considering these currently allocated slots as well in the slot availability check is good enough. This will not break the premise for the {{WaitingForResources}} state.
> {{RescaleOnCheckpointITCase}} fails because of that issue:
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=62105&view=logs&j=5c8e7682-d68f-54d1-16a2-a09310218a49&t=86f654fa-ab48-5c1a-25f4-7e7f6afb9bba&l=11287
> {code}
> Sep 13 17:16:55 "ForkJoinPool-1-worker-25" #28 daemon prio=5 os_prio=0 tid=0x7f973f0c2800 nid=0x31a1 waiting on condition [0x7f97089fc000]
> Sep 13 17:16:55    java.lang.Thread.State: TIMED_WAITING (sleeping)
> Sep 13 17:16:55 	at java.lang.Thread.sleep(Native Method)
> Sep 13 17:16:55 	at org.apache.flink.runtime.testutils.CommonTestUtils.waitUntilCondition(CommonTestUtils.java:152)
> Sep 13 17:16:55 	at org.apache.flink.runtime.testutils.CommonTestUtils.waitUntilCondition(CommonTestUtils.java:145)
> Sep 13 17:16:55 	at org.apache.flink.test.scheduling.UpdateJobResourceRequirementsITCase.waitForRunningTasks(UpdateJobResourceRequirementsITCase.java:219)
> Sep 13 17:16:55 	at org.apache.flink.test.scheduling.RescaleOnCheckpointITCase.testRescaleOnCheckpoint(RescaleOnCheckpointITCase.java:139)
> Sep 13 17:16:55 	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> Sep 13 17:16:55 	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> [...]
> {code}
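The premise described in FLINK-36279 can be illustrated with a small sketch. This is not Flink's implementation: `SlotSnapshot`, `has_desired_resources`, and the `count_allocated` switch are hypothetical names, and slots are reduced to plain counts, whereas the real scheduler presumably compares richer resource information. It only shows why a free-slots-only check can be too strict while a job is executing.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SlotSnapshot:
    """Hypothetical view of the slots the scheduler knows about."""
    free_slots: int       # slots not assigned to any task
    allocated_slots: int  # slots held by the running job; freed on restart


def has_desired_resources(snapshot: SlotSnapshot,
                          desired_slots: int,
                          count_allocated: bool = True) -> bool:
    """Return True if the desired parallelism could be satisfied.

    With count_allocated=False only free slots are considered, which is
    the behaviour the issue describes as sufficient for
    WaitingForResources but too strict for Executing.
    """
    available = snapshot.free_slots
    if count_allocated:
        # Slots in use now become available again after the restart.
        available += snapshot.allocated_slots
    return available >= desired_slots


# Executing: 4 slots in use, 2 new free slots, target parallelism 6.
executing = SlotSnapshot(free_slots=2, allocated_slots=4)
print(has_desired_resources(executing, 6, count_allocated=False))
print(has_desired_resources(executing, 6))
```

In a state with no allocated slots, such as WaitingForResources, both variants give the same answer, which matches the issue's statement that the change does not break the premise for that state.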
[jira] [Resolved] (FLINK-36011) Generalize RescaleManager to become StateTransitionManager
[ https://issues.apache.org/jira/browse/FLINK-36011?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zdenek Tison resolved FLINK-36011.

Resolution: Implemented

> Generalize RescaleManager to become StateTransitionManager
>
> Key: FLINK-36011
> URL: https://issues.apache.org/jira/browse/FLINK-36011
> Project: Flink
> Issue Type: Sub-task
> Components: Runtime / Coordination
> Reporter: Zdenek Tison
> Assignee: Zdenek Tison
> Priority: Major
> Labels: pull-request-available
> Fix For: 2.0.0
>
> The goal is to change the RescaleManager component to one with a broader responsibility that will manage the adaptive scheduler's state transitions.
[jira] [Created] (FLINK-36016) Synchronize initialization time and clock usage
Zdenek Tison created FLINK-36016:

Summary: Synchronize initialization time and clock usage
Key: FLINK-36016
URL: https://issues.apache.org/jira/browse/FLINK-36016
Project: Flink
Issue Type: Sub-task
Reporter: Zdenek Tison

StateTransitionManager's initialization time and the clock parameter should be based on the same time.
[jira] [Created] (FLINK-36015) Align rescale parameters
Zdenek Tison created FLINK-36015:

Summary: Align rescale parameters
Key: FLINK-36015
URL: https://issues.apache.org/jira/browse/FLINK-36015
Project: Flink
Issue Type: Sub-task
Components: Runtime / Configuration
Reporter: Zdenek Tison

* Parameter [_jobmanager.adaptive-scheduler.resource-wait-timeout_|https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/config/#jobmanager-adaptive-scheduler-resource-wait-timeout] will be renamed to the jobmanager.adaptive-scheduler.submission.resource-wait-timeout
* Parameter [_jobmanager.adaptive-scheduler.resource-stabilization-timeout_|https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/config/#jobmanager-adaptive-scheduler-resource-wait-timeout] will be renamed to the jobmanager.adaptive-scheduler.submission.resource-stabilization-timeout
* Parameter [_jobmanager.adaptive-scheduler.scaling-interval.min_|https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/config/#jobmanager-adaptive-scheduler-scaling-interval-min] will be renamed to the jobmanager.adaptive-scheduler.executing.cooldown-after-rescaling
* Parameter [_jobmanager.adaptive-scheduler.scaling-interval.max_|https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/config/#jobmanager-adaptive-scheduler-scaling-interval-max] will be renamed to the jobmanager.adaptive-scheduler.executing.resource-stabilization-timeout with default value 60s.
* Parameter [jobmanager.adaptive-scheduler.min-parallelism-increase|https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/config/#jobmanager-adaptive-scheduler-min-parallelism-increase] will be removed without a direct replacement. Still, it will be superseded by combining the parameters jobmanager.adaptive-scheduler.executing.cooldown-after-rescaling and jobmanager.adaptive-scheduler.executing.resource-stabilization-timeout
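For anyone carrying an older configuration forward, the renames can be sketched as a simple key migration. This is a hypothetical helper, not part of Flink: the function name and the plain-dict representation of the configuration are assumptions, the key pairs are taken from this issue and its latest release note, and carrying the old `scaling-interval.max` value over to its replacement is an assumption as well, since the release note says "replaced by" rather than "renamed".

```python
import warnings

PREFIX = "jobmanager.adaptive-scheduler."

# Old key suffix -> new key suffix, per the FLINK-36015 release note.
RENAMED = {
    "resource-wait-timeout": "submission.resource-wait-timeout",
    "resource-stabilization-timeout": "submission.resource-stabilization-timeout",
    "scaling-interval.min": "executing.cooldown-after-rescaling",
    # Assumption: the old value is still meaningful for the replacement.
    "scaling-interval.max": "executing.resource-stabilization-timeout",
    "max-delay-for-scale-trigger": "rescale-trigger.max-delays",
    "scale-on-failed-checkpoints-count": "rescale-trigger.max-checkpoint-failures",
}
# Removed without a replacement.
REMOVED = {"min-parallelism-increase"}

OLD_TO_NEW = {PREFIX + old: PREFIX + new for old, new in RENAMED.items()}
REMOVED_KEYS = {PREFIX + key for key in REMOVED}


def migrate_adaptive_scheduler_keys(config: dict) -> dict:
    """Return a copy of config with old AdaptiveScheduler keys renamed.

    A value already set under a new key wins over the old key.
    """
    # First pass: keep everything that is neither an old nor a removed key.
    migrated = {
        key: value
        for key, value in config.items()
        if key not in OLD_TO_NEW and key not in REMOVED_KEYS
    }
    # Second pass: move old keys, without overriding explicit new keys.
    for key, value in config.items():
        if key in REMOVED_KEYS:
            warnings.warn(f"{key} was removed without a replacement; dropping it")
        elif key in OLD_TO_NEW:
            migrated.setdefault(OLD_TO_NEW[key], value)
    return migrated


old_config = {
    PREFIX + "scaling-interval.min": "30s",
    PREFIX + "min-parallelism-increase": "2",
    "parallelism.default": "4",
}
print(migrate_adaptive_scheduler_keys(old_config))
```

Letting an explicitly set new key take precedence is a design choice of this sketch; Flink's configuration documentation remains the reference for how old keys are actually treated.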
[jira] [Created] (FLINK-36014) Align the desired and sufficient resources definition in Executing and WaitingForResources states
Zdenek Tison created FLINK-36014: Summary: Align the desired and sufficient resources definition in Executing and WaitingForResources states Key: FLINK-36014 URL: https://issues.apache.org/jira/browse/FLINK-36014 Project: Flink Issue Type: Sub-task Components: Runtime / Coordination Reporter: Zdenek Tison The goal is to use the same definition for the desired and sufficient resources in the Executing state as in the WaitingForResources state.
[jira] [Created] (FLINK-36013) Introduce the transition from Restarting to CreatingExecutionGraph state
Zdenek Tison created FLINK-36013: Summary: Introduce the transition from Restarting to CreatingExecutionGraph state Key: FLINK-36013 URL: https://issues.apache.org/jira/browse/FLINK-36013 Project: Flink Issue Type: Sub-task Components: Runtime / Coordination Reporter: Zdenek Tison The AdaptiveScheduler omits the WaitingForResources state when rescaling. Pass a flag into the Restarting state that directs the state transition to CreatingExecutionGraph instead of WaitingForResources.
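The flag-driven transition described in this issue can be pictured with a minimal sketch. It assumes a single boolean decides the successor of Restarting; the enum, the function and the flag name `skip_waiting_for_resources` are hypothetical and do not mirror the actual AdaptiveScheduler classes.

```python
from enum import Enum, auto


class State(Enum):
    RESTARTING = auto()
    WAITING_FOR_RESOURCES = auto()
    CREATING_EXECUTION_GRAPH = auto()


def next_state_after_restarting(skip_waiting_for_resources: bool) -> State:
    """Pick the state that follows Restarting.

    Hypothetical flag: True models a rescale, where resources are already
    known, so the scheduler goes straight to creating the execution graph.
    """
    if skip_waiting_for_resources:
        return State.CREATING_EXECUTION_GRAPH
    return State.WAITING_FOR_RESOURCES
```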
[jira] [Created] (FLINK-36012) Integrate StateTransitionManager into WaitingForResources state
Zdenek Tison created FLINK-36012: Summary: Integrate StateTransitionManager into WaitingForResources state Key: FLINK-36012 URL: https://issues.apache.org/jira/browse/FLINK-36012 Project: Flink Issue Type: Sub-task Components: Runtime / Coordination Reporter: Zdenek Tison The StateTransitionManager will be used in the WaitingForResources state to manage the transition to a subsequent state.
[jira] [Created] (FLINK-36011) Generalize RescaleManager to become StateTransitionManager
Zdenek Tison created FLINK-36011: Summary: Generalize RescaleManager to become StateTransitionManager Key: FLINK-36011 URL: https://issues.apache.org/jira/browse/FLINK-36011 Project: Flink Issue Type: Sub-task Components: Runtime / Coordination Reporter: Zdenek Tison Fix For: 2.0.0 The goal is to change the RescaleManager component to one with a broader responsibility that will manage the adaptive scheduler's state transitions.
[jira] [Commented] (FLINK-35035) Reduce job pause time when cluster resources are expanded in adaptive mode
[ https://issues.apache.org/jira/browse/FLINK-35035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17872011#comment-17872011 ]

Zdenek Tison commented on FLINK-35035:

This task will be used as a parent task for changes proposed in [FLIP-472|https://cwiki.apache.org/confluence/display/FLINK/FLIP-472%3A+Aligning+timeout+logic+in+the+AdaptiveScheduler%27s+WaitingForResources+and+Executing+states]

> Reduce job pause time when cluster resources are expanded in adaptive mode
>
> Key: FLINK-35035
> URL: https://issues.apache.org/jira/browse/FLINK-35035
> Project: Flink
> Issue Type: Improvement
> Components: Runtime / Task
> Affects Versions: 1.19.0
> Reporter: yuanfenghu
> Assignee: Zdenek Tison
> Priority: Minor
>
> When 'jobmanager.scheduler = adaptive', job graph changes triggered by cluster expansion cause long-term task stagnation. We should reduce this impact.
> As an example:
> I have a job graph: [v1 (maxp=10, minp=1)] -> [v2 (maxp=10, minp=1)]
> When my cluster has 5 slots, the job is executed as [v1 p5] -> [v2 p5].
> When I add slots, the task triggers job graph changes via org.apache.flink.runtime.scheduler.adaptive.ResourceListener#onNewResourcesAvailable. However, the five new slots I added are not discovered at the same time (for convenience, I assume that a TaskManager has one slot), because in no environment can we guarantee that the new slots are added at once. This causes onNewResourcesAvailable to trigger repeatedly. If there is a certain interval between the new slots, the job graph keeps changing during this period. What I hope for is a configurable stabilization time for cluster resources, so that job graph changes are triggered only after the number of cluster slots has been stable for a certain period.
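The behaviour the reporter asks for, rescaling only once the slot count has stopped changing for a while, can be sketched as a small debounce. This is a toy model under stated assumptions: `SlotStabilizer`, `tick` and the integer timestamps are hypothetical and are not Flink's implementation; only the callback name `onNewResourcesAvailable` comes from the issue.

```python
class SlotStabilizer:
    """Trigger one rescale after the slot count has been stable long enough."""

    def __init__(self, stabilization_timeout):
        self.stabilization_timeout = stabilization_timeout
        self.slots = 0
        self.last_change = None  # time of the latest slot-count change
        self.rescales = []       # slot counts at which a rescale was triggered

    def on_new_resources_available(self, now, slots):
        # Every change of the slot count restarts the stabilization window.
        if slots != self.slots:
            self.slots = slots
            self.last_change = now

    def tick(self, now):
        # Rescale only if a change is pending and the window has elapsed.
        if (self.last_change is not None
                and now - self.last_change >= self.stabilization_timeout):
            self.rescales.append(self.slots)
            self.last_change = None
            return True
        return False


# Five slots arrive one time unit apart; the timeout is 10 time units.
stabilizer = SlotStabilizer(stabilization_timeout=10)
for t, slots in enumerate([6, 7, 8, 9, 10]):
    stabilizer.on_new_resources_available(now=t, slots=slots)
    stabilizer.tick(now=t)
stabilizer.tick(now=14)
print(stabilizer.rescales)  # [10]: a single rescale instead of five
```

Without the window, each of the five arrivals would have changed the job graph; with it, the job is restarted once at the final parallelism.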
[jira] [Commented] (FLINK-35035) Reduce job pause time when cluster resources are expanded in adaptive mode
[ https://issues.apache.org/jira/browse/FLINK-35035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17865350#comment-17865350 ] Zdenek Tison commented on FLINK-35035: -- Hi, [~echauchot] [~heigebupahei] I took over [~mapohl]'s work and prepared the FLIP document on the discussed topic. Please take a look if the topic is still relevant to you. Thanks https://lists.apache.org/thread/krnjv8fm62nbnrljmk3bfoons86pc1dw
[jira] [Commented] (FLINK-30403) The reported latest completed checkpoint is discarded
[ https://issues.apache.org/jira/browse/FLINK-30403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17692567#comment-17692567 ] Zdenek Tison commented on FLINK-30403: -- Hi, thanks for asking. No, let's close it.
[jira] [Created] (FLINK-30403) The reported latest completed checkpoint is discarded
Zdenek Tison created FLINK-30403: Summary: The reported latest completed checkpoint is discarded Key: FLINK-30403 URL: https://issues.apache.org/jira/browse/FLINK-30403 Project: Flink Issue Type: Bug Components: Runtime / Checkpointing Affects Versions: 1.16.0 Reporter: Zdenek Tison There is a small window in which the reported latest completed checkpoint can already be marked as discarded while the new checkpoint has not been reported yet. The reason is that the function _addCompletedCheckpointToStoreAndSubsumeOldest_ is called before _reportCompletedCheckpoint_ in _CheckpointCoordinator_.
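The window described in this issue can be reproduced in a simplified model. The sketch below assumes a store that retains a single checkpoint and a stats object holding the last reported id; `CheckpointStore`, `Stats`, `complete_checkpoint` and the `observe` hook are hypothetical and only mimic the call order named in the issue, not the real CheckpointCoordinator.

```python
class CheckpointStore:
    """Keeps the newest checkpoints and discards the ones it subsumes."""

    def __init__(self, max_retained=1):
        self.max_retained = max_retained
        self.retained = []
        self.discarded = set()

    def add_and_subsume_oldest(self, checkpoint_id):
        self.retained.append(checkpoint_id)
        while len(self.retained) > self.max_retained:
            self.discarded.add(self.retained.pop(0))


class Stats:
    """Holds what an outside observer sees as the latest completed checkpoint."""

    def __init__(self):
        self.latest_completed = None

    def report_completed(self, checkpoint_id):
        self.latest_completed = checkpoint_id


def complete_checkpoint(store, stats, checkpoint_id, observe):
    # Order taken from the issue: the store is updated (and the oldest
    # checkpoint subsumed) before the new checkpoint is reported.
    store.add_and_subsume_oldest(checkpoint_id)
    observe()  # stands in for a reader looking at the stats in between
    stats.report_completed(checkpoint_id)
```

Calling `observe` between the two steps while completing a second checkpoint shows the first one still reported as latest although it is already in `discarded`; reporting before subsuming would close that window in this model.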