boy-uber commented on pull request #29916:
URL: https://github.com/apache/spark/pull/29916#issuecomment-702798048

> so I believe the reason for this was that the external shuffle is assuming certain things - specifically getSortBasedShuffleBlockData, so if the shuffle manager does not support that similarly then weird things could happen because it assumes a certain file naming and layout.
> now this code is old from when Spark had other shuffle managers - hash shuffle manager
> I think a lot of the problem is while spark has a config to replace the shuffle manager it isn't really well documented or designed to have appropriate pieces public/private, etc. That is why so many people are working on it.
>
> I'd like to at least see documentation put in the description of the class that it relies on that so it's more obvious to the user.

Yes, good suggestion! I agree that the current shuffle manager isn't really well documented or designed to have the appropriate pieces public/private. We could improve that as well. I will add some simple documentation to the class.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
