austince edited a comment on pull request #15497: URL: https://github.com/apache/flink/pull/15497#issuecomment-815329177
> Unless I'm missing anything, then here is an example where this happens:
>
> ```
> P1= 80 => MP1=128
> P2=100 => MP2=256
> ```

So, similarly to option 2, with option 1 we still have this inconsistency that can very well break existing jobs when they are migrated to the adaptive scheduler, _or at some point in the future after migration_. The only way to prevent that is option 3, or option 4: outright reject jobs that have not explicitly set the max parallelism. That is possible; I just created a test case that proves it. 😞

So, I think option 4 (require max parallelism to be set) would be the simplest to get in, and it is not a difficult constraint to communicate to users because:

- a) the adaptive scheduler + Reactive Mode are new, "experimental" features;
- b) setting max parallelism on all operators is already documented as a best practice for production jobs;
- c) there is a solid solution that can immediately be queued up for the next release (reading savepoints before creating the graph).

I guess this is something @tillrohrmann + @knaufk (original author of FLINK-21844) should weigh in on?

I think option 3 would only be a temporary solution and would get tricky, as there is currently no communication between the scheduler and the state restore, and there are quite a few layers in between. Unless I misunderstand the updates that option would require.
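For context on where the quoted numbers come from: when no max parallelism is set explicitly, Flink derives a default from the operator parallelism (roughly 1.5x the parallelism, rounded up to the next power of two and clamped to the [128, 32768] range). The sketch below is my approximation of that heuristic (the class/method name and bounds are from memory, not copied from the codebase); it reproduces the `80 => 128` / `100 => 256` jump in the quoted example:

```java
// Rough sketch of the default max-parallelism heuristic (my approximation of
// KeyGroupRangeAssignment#computeDefaultMaxParallelism): ~1.5x the operator
// parallelism, rounded up to the next power of two, clamped to [128, 32768].
public class DefaultMaxParallelismSketch {

    private static final int LOWER_BOUND = 1 << 7;   // 128
    private static final int UPPER_BOUND = 1 << 15;  // 32768

    static int defaultMaxParallelism(int operatorParallelism) {
        int scaled = operatorParallelism + operatorParallelism / 2;               // ~1.5x
        int powerOfTwo = Integer.highestOneBit(Math.max(scaled - 1, 1)) << 1;     // smallest power of two >= scaled
        return Math.min(Math.max(powerOfTwo, LOWER_BOUND), UPPER_BOUND);
    }

    public static void main(String[] args) {
        System.out.println(defaultMaxParallelism(80));   // 128 (1.5 * 80 = 120 -> 128)
        System.out.println(defaultMaxParallelism(100));  // 256 (1.5 * 100 = 150 -> 256)
    }
}
```

The point being: two different parallelisms can land on different default max parallelisms, so state keyed at one parallelism can become incompatible at another unless the max parallelism is pinned explicitly (e.g. `setMaxParallelism(...)` on the environment or per operator), which is what option 4 would force users to do.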
