austince edited a comment on pull request #15348:
URL: https://github.com/apache/flink/pull/15348#issuecomment-806116899


   > After a discussion with @tillrohrmann we also found another issue:
   > If the StateAssignmentOperation is able to set the maxParallelism, then 
the reactive mode cannot make a correct decision as to what the actual max 
parallelism is. It will come up with some initial parallelism, but this can 
easily exceed the maxParallelism of the savepoint.
   
   Good point. In a follow-up with @tillrohrmann, the workaround for now seems 
to be ensuring that, in Reactive scheduling mode, the StateAssignmentOperation 
does not decrease the `maxParallelism` declared during scheduling (either by 
the user or by the system). I think a decrease could happen in one of two ways, 
both indicating a user error:
   
   1.
   - the user configures a max parallelism lower than the autoConf default
   - the user removes the explicit max parallelism and upgrades the job
   - the system defaults to a higher value and schedules
   - state assignment restores the lower value (the previously set user value)
   2.
   - the user does not set a max parallelism, so the system uses the autoConf 
default
   - the user sets an explicit max parallelism greater than the autoConf default 
and upgrades the job
   - the system uses the explicit higher max parallelism and schedules
   - state assignment restores the lower value (the previously set autoConf 
default)
   
   In both cases, the problem occurs because the user changes the max 
parallelism and then restores from state, which is a user error [noted in the 
docs](https://ci.apache.org/projects/flink/flink-docs-master/docs/ops/production_ready/#set-an-explicit-max-parallelism):
 __"There is currently no way to change the maximum parallelism of an operator 
after a job has started without discarding that operators state."__
   
   Therefore, I think it is ok to throw an `IllegalStateException` in this 
situation. What do you think @zentol? Did I miss a situation?
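
A rough sketch of the check I have in mind, to make the proposal concrete. The class and method names are hypothetical and are not existing Flink APIs. I am also assuming that outside Reactive mode the restored value keeps overriding the scheduled one, and that a restored value greater than or equal to the scheduled one is accepted as-is:

```java
/** Hypothetical helper for illustration only; not an actual Flink class. */
final class MaxParallelismGuard {

    private MaxParallelismGuard() {}

    /**
     * Decides which max parallelism applies after state has been restored.
     *
     * @param scheduledMaxParallelism value declared during scheduling (user-set or autoConf)
     * @param restoredMaxParallelism value recorded in the restored state
     * @param reactiveMode whether Reactive scheduling mode is active
     * @return the max parallelism to use for the vertex
     */
    static int resolve(
            int scheduledMaxParallelism, int restoredMaxParallelism, boolean reactiveMode) {
        // In Reactive mode the scheduler has already picked a parallelism based on
        // the scheduled max parallelism, so lowering it afterwards is rejected.
        if (reactiveMode && restoredMaxParallelism < scheduledMaxParallelism) {
            throw new IllegalStateException(
                    "Restored max parallelism "
                            + restoredMaxParallelism
                            + " is lower than the max parallelism "
                            + scheduledMaxParallelism
                            + " declared during scheduling.");
        }
        // Assumption: otherwise the restored value wins, as it does today.
        return restoredMaxParallelism;
    }
}
```

With made-up numbers for the first case above: if the system schedules with 128 and the state restores 64, `MaxParallelismGuard.resolve(128, 64, true)` throws, while the same call with `reactiveMode = false` returns 64.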
   
   In the longer term, we'll want to move restoring the `maxParallelism` from 
state to somewhere else, perhaps before scheduling.



