zentol commented on pull request #15497: URL: https://github.com/apache/flink/pull/15497#issuecomment-815341717
> While it fixes the RescalingITCase, it also introduces the issue that we may use a smaller maxParallelism than we initially used for requesting resources, when the AdaptiveScheduler is used without reactive mode. So even if the max parallelism was set to 8 based on the savepoint information, the scheduler will still initially ask for 128+ slots, and hold on to them until the job terminates.

Thinking back on it, this may be incorrect (duh).

With reactive mode we reject this case because we used a higher max parallelism than what was set in the savepoint. Without reactive mode, we define the initial requirements based on the parallelism, not the max parallelism. So long as the parallelism does not exceed the initially derived or set max parallelism, the job will run fine without wasting resources. If it exceeds the max parallelism set in the savepoint then the job will fail, which is fine because those are the semantics of the max parallelism.

God this issue is messing with my brain... 🤯

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at: [email protected]
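The reasoning in the comment above can be illustrated with a small model. This is only a sketch, not Flink code: `JobVertex`, `initial_slot_requirement` and `can_restore` are hypothetical names, and the assumption that reactive mode requests slots up to the max parallelism is inferred from the comment rather than taken from the scheduler implementation.

```python
from dataclasses import dataclass


@dataclass
class JobVertex:
    """Hypothetical stand-in for a job vertex; not a Flink class."""
    parallelism: int       # configured parallelism
    max_parallelism: int   # initially derived or explicitly set


def initial_slot_requirement(vertex: JobVertex, reactive_mode: bool) -> int:
    # Assumption: reactive mode sizes its request by the max parallelism,
    # while the non-reactive path sizes it by the configured parallelism.
    return vertex.max_parallelism if reactive_mode else vertex.parallelism


def can_restore(vertex: JobVertex, savepoint_max_parallelism: int,
                reactive_mode: bool) -> bool:
    # The job can only run if what was requested fits within the
    # max parallelism recorded in the savepoint.
    requested = initial_slot_requirement(vertex, reactive_mode)
    return requested <= savepoint_max_parallelism


if __name__ == "__main__":
    vertex = JobVertex(parallelism=4, max_parallelism=128)
    print(initial_slot_requirement(vertex, reactive_mode=False))
    print(can_restore(vertex, savepoint_max_parallelism=8, reactive_mode=False))
    print(can_restore(vertex, savepoint_max_parallelism=8, reactive_mode=True))
```

Under these assumptions, a vertex with parallelism 4 and a derived max parallelism of 128 requests only 4 slots without reactive mode and restores fine from a savepoint with max parallelism 8, whereas the reactive-mode case is rejected.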
