zentol commented on pull request #15497:
URL: https://github.com/apache/flink/pull/15497#issuecomment-815341717


   > While it fixes the RescalingITCase, it also introduces the issue that we 
may use a smaller maxParallelism than we initially used for requesting 
resources, when the AdaptiveScheduler is used without reactive mode. So even if 
the max parallelism was set to 8 based on the savepoint information, the 
scheduler will still initially ask for 128+ slots, and hold on to them until 
the job terminates.
   
   Thinking back on it, this may be incorrect (duh).
   With reactive mode we reject this case because we used a higher max 
parallelism than what was set in the savepoint.
   Without reactive mode, we define the initial requirements based on the 
parallelism, not max parallelism. So long as the parallelism does not exceed 
the initially derived or set max parallelism the job will run fine without 
wasting resources. If it exceeds the max parallelism set in the savepoint, then 
the job will fail, which is fine because those are the semantics of the max 
parallelism.
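   
   To make the non-reactive case concrete, here is a rough sketch of the reasoning above. It is not the AdaptiveScheduler code; the class and method names are hypothetical, and the numbers 8 and 128 are just the ones from the quoted example.

```java
// Hypothetical illustration only; none of these names exist in Flink.
public class ParallelismCheckSketch {

    /** Without reactive mode, the initial slot request follows the parallelism, not the max parallelism. */
    static int initialSlotRequirement(int parallelism) {
        return parallelism;
    }

    /** The job can run only if its parallelism does not exceed the max parallelism from the savepoint. */
    static boolean fitsSavepoint(int parallelism, int savepointMaxParallelism) {
        return parallelism <= savepointMaxParallelism;
    }

    public static void main(String[] args) {
        int savepointMaxParallelism = 8; // value restored from the savepoint
        int derivedMaxParallelism = 128; // value derived before the savepoint was read

        // Parallelism 4: requests 4 slots (not 128) and fits the savepoint.
        System.out.println(initialSlotRequirement(4));
        System.out.println(fitsSavepoint(4, savepointMaxParallelism));

        // Parallelism 16: within the derived max, but above the savepoint's max, so the job fails.
        System.out.println(16 <= derivedMaxParallelism);
        System.out.println(fitsSavepoint(16, savepointMaxParallelism));
    }
}
```

   So the slot request never depends on the derived max parallelism here, and the only failure is the one the max parallelism is supposed to cause.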
   
   God this issue is messing with my brain... 🤯 
   
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]

