dhercher commented on pull request #15246:
URL: https://github.com/apache/beam/pull/15246#issuecomment-893006636


   My issue is that the reshuffle is having the inverse effect: because we force 
full parallelism at this stage, it's very easy to cause OOM crash loops when 
reading many files at once.
   
   The Dataflow runner, at least, does not appear to know how to scale down 
the number of threads when this happens.  Removing the reshuffle at least 
allows someone to force their desired behavior when they know what they want, 
rather than forcing a single aggressive strategy.
   
   I suppose the feature flag could control whether the reshuffle occurs at 
all, maintaining the current behavior by default while allowing the user to 
manage the parallelism where needed.



