StephanEwen commented on pull request #13595:
URL: https://github.com/apache/flink/pull/13595#issuecomment-716404546


   @wsry 
   
   I am confused a bit about the use of max buffers. That value is the upper 
limit of buffers that will be assigned to the shuffle from the global pool. It 
is only used for streaming pipelined connections, to limit the amount of 
in-flight buffers (so checkpoints don't take too long). For batch it should not 
be used, because there is no harm in using as much memory as possible.
   
   The min buffers is actually hat decouples the memory use from the 
parallelism. If the min-buffers is related to the number of subpartitions, then 
we still have the problem that shuffles fail on large parallelism (parallelism 
is higher than available memory buffers).
   
   So, in conclusion, I think we should not have a max value (because it does 
not help in decoupling from parallelism) and also need to decouple the min 
value from the parallelism.
   
   I also liked the idea of having the 
`taskmanager.network.sort-shuffle.min-parallelism` as a flag. That way low 
parallelisms (< 50) could use the hash shuffle and larger parallelisms could 
use the sort shuffle.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]


Reply via email to