1996fanrui opened a new pull request, #24248:
URL: https://github.com/apache/flink/pull/24248

   ## Purpose
   
   AutoRescalingITCase#testCheckpointRescalingWithKeyedAndNonPartitionedState 
may hang in waitForRunningTasks(restClusterClient, jobID, parallelism2).
   
   ## Reason
   
   The job has 2 tasks (vertices). After updateJobResourceRequirements is 
called, the source parallelism is unchanged (it stays at parallelism), and the 
FlatMapper+Sink parallelism changes from parallelism to parallelism2.
   
   So the expected number of running tasks is parallelism + parallelism2, 
not parallelism2.
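   
   To make the arithmetic concrete, here is a minimal standalone sketch (not 
code from the test). It assumes every parallel subtask of every vertex counts 
as one running task. The class name, the helper, and the values 2 and 4 are 
hypothetical and chosen only for illustration.
   
```java
import java.util.LinkedHashMap;
import java.util.Map;

public class ExpectedRunningTasks {

    /** Sums the parallelism of all vertices: one running task per parallel subtask. */
    static int expectedRunningTasks(Map<String, Integer> parallelismPerVertex) {
        int total = 0;
        for (int p : parallelismPerVertex.values()) {
            total += p;
        }
        return total;
    }

    public static void main(String[] args) {
        int parallelism = 2;  // hypothetical value, source keeps this parallelism
        int parallelism2 = 4; // hypothetical value, new FlatMapper+Sink parallelism

        Map<String, Integer> afterRescale = new LinkedHashMap<>();
        afterRescale.put("Source", parallelism);
        afterRescale.put("FlatMapper+Sink", parallelism2);

        // parallelism + parallelism2, not parallelism2 alone
        System.out.println(expectedRunningTasks(afterRescale)); // prints 6
    }
}
```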
   
    
   ## Why does the test pass for now?
   
   Flink 1.19 supports a scaling cooldown, and the cooldown time is 30s by 
default. This means the Flink job is rescaled only 30 seconds after 
updateJobResourceRequirements is called.
   
   So the tasks are still running with the old parallelism when we call 
waitForRunningTasks(restClusterClient, jobID, parallelism2).
   
   IIUC, this timing is not guaranteed, and relying on it is unexpected.
   
    
   ## How to reproduce this bug?
   
   
https://github.com/1996fanrui/flink/commit/ffd713e24d37db2c103e4cd4361d0cd916d0d2f6
   
   1. Disable the cooldown
   2. Sleep for a while before waitForRunningTasks
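   
   The effect of these two steps can be imitated without a Flink cluster. The 
sketch below is a simulation, not the real waitForRunningTasks: it assumes the 
helper polls until the number of running tasks exactly matches the desired 
number, and it adds a timeout so the demo terminates instead of hanging. The 
class name, the waitForCount helper, and the values 2 and 4 are hypothetical.
   
```java
import java.time.Duration;
import java.util.function.IntSupplier;

public class WaitForCountSketch {

    /**
     * Polls until the supplier reports exactly the desired count.
     * Returns true on a match, false if the timeout elapses or the thread is interrupted.
     */
    static boolean waitForCount(IntSupplier runningTasks, int desired, Duration timeout) {
        long deadline = System.nanoTime() + timeout.toNanos();
        while (System.nanoTime() - deadline < 0) {
            if (runningTasks.getAsInt() == desired) {
                return true;
            }
            try {
                Thread.sleep(10);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        int parallelism = 2;  // hypothetical value
        int parallelism2 = 4; // hypothetical value

        // Simulated state once the rescale has already happened (cooldown disabled):
        // the source runs with parallelism, FlatMapper+Sink with parallelism2.
        IntSupplier afterRescale = () -> parallelism + parallelism2;

        // Waiting for parallelism2 alone never matches; without a timeout this would hang.
        System.out.println(waitForCount(afterRescale, parallelism2, Duration.ofMillis(200))); // prints false

        // Waiting for parallelism + parallelism2 matches on the first poll.
        System.out.println(waitForCount(afterRescale, parallelism + parallelism2, Duration.ofMillis(200))); // prints true
    }
}
```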
   
   

