[
https://issues.apache.org/jira/browse/FLINK-31457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17700503#comment-17700503
]
Junrui Li commented on FLINK-31457:
-----------------------------------
[~a.pilipenko] I'm not sure in which scenario
`NoResourceAvailableException` would be reported after a job restart. Can you
describe it in detail?
IIUC, if it is a session cluster, the slots may be taken by other jobs once the
slot idle timeout expires. You could try increasing `slot.idle.timeout`.
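A minimal sketch of that change in flink-conf.yaml, assuming the value is given in milliseconds; the number below is only an illustration, not a recommendation:

```yaml
# flink-conf.yaml
# How long the slot pool keeps an idle slot before releasing it (milliseconds).
# 300000 (5 minutes) is an arbitrary example value; tune it to your restart times.
slot.idle.timeout: 300000
```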
In addition, the adaptive scheduler has a mechanism to wait for resources
because it can dynamically adjust the parallelism and run jobs with a lower
parallelism when resources are insufficient. The default scheduler has no
such capability, so when resources are insufficient it reports
`NoResourceAvailableException`. If you want jobs to run even when resources are
insufficient, you can use the adaptive scheduler for streaming jobs.
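As a rough sketch, switching a streaming job to the adaptive scheduler could look like this in flink-conf.yaml. The two timeout values are illustrative assumptions only; please check the configuration docs for your Flink version before relying on them:

```yaml
# flink-conf.yaml
# Use the adaptive scheduler instead of the default one (streaming jobs only).
jobmanager.scheduler: adaptive
# Maximum time to wait for the minimum required resources (example value).
jobmanager.adaptive-scheduler.resource-wait-timeout: 5 min
# Time to wait for more resources once the minimum is available (example value).
jobmanager.adaptive-scheduler.resource-stabilization-timeout: 10 s
```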
> Support waiting for required resources in DefaultScheduler during job restart
> -----------------------------------------------------------------------------
>
> Key: FLINK-31457
> URL: https://issues.apache.org/jira/browse/FLINK-31457
> Project: Flink
> Issue Type: Improvement
> Components: Runtime / Coordination
> Affects Versions: 1.15.3
> Reporter: Aleksandr Pilipenko
> Priority: Major
>
> Currently Flink supports [waiting for required resources to become
> available|https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/config/#jobmanager-adaptive-scheduler-resource-stabilization-timeout]
> during job restart only when using the adaptive scheduler.
> On the other hand, if the cluster is using the default scheduler and there are
> not enough slots available, restart attempts will fail with
> `NoResourceAvailableException` until resource requirements are satisfied.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)