[ 
https://issues.apache.org/jira/browse/FLINK-37018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17910428#comment-17910428
 ] 

Zhenqiu Huang commented on FLINK-37018:
---------------------------------------

BTW, we are testing on Flink 1.18. 

> Adaptive scheduler triggers multiple internal restarts for a single rescale 
> event
> ---------------------------------------------------------------------------------
>
>                 Key: FLINK-37018
>                 URL: https://issues.apache.org/jira/browse/FLINK-37018
>             Project: Flink
>          Issue Type: Bug
>          Components: Runtime / Task
>    Affects Versions: 1.18.0
>            Reporter: Sai Sharath Dandi
>            Priority: Major
>         Attachments: jobmanager.log
>
>
> We observe that a single rescale event from the autoscaler triggers multiple 
> internal restarts by the adaptive scheduler, even though the job has no 
> other reason or exception that would cause an internal restart. There can be 
> 2-3 restarts within a very short period (1-2 minutes) before the job stabilizes.
> In the attached JobManager log, the following messages appear:
>  # "Can change the parallelism of job. Restarting job." (17 times)
>  # "Received resource requirements from job" (7 times)
>  
> The job was restarted internally 17 times despite receiving only 7 rescaling 
> requests from the autoscaler.
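
The two counts quoted above can be reproduced from a JobManager log with a small script. This is a minimal sketch, not part of the original report: it assumes the two message fragments appear verbatim on a single log line each, and the names `PATTERNS`, `count_messages`, and `count_in_file` are hypothetical helpers introduced for illustration.

```python
# Message fragments as quoted in the issue description (assumed to appear
# verbatim, each on a single log line).
PATTERNS = {
    "restarts": "Can change the parallelism of job. Restarting job.",
    "requirements": "Received resource requirements from job",
}


def count_messages(lines):
    """Count how many lines contain each fragment in PATTERNS."""
    counts = {name: 0 for name in PATTERNS}
    for line in lines:
        for name, fragment in PATTERNS.items():
            if fragment in line:
                counts[name] += 1
    return counts


def count_in_file(path):
    """Convenience wrapper: count fragments in a log file on disk."""
    with open(path, encoding="utf-8", errors="replace") as log:
        return count_messages(log)
```

Calling `count_in_file("jobmanager.log")` on the attached log would be expected to return the two counts in the description if the assumptions above hold. A restart count well above the requirements count suggests more than one restart per rescale request.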



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
