manuzhang commented on pull request #29797: URL: https://github.com/apache/spark/pull/29797#issuecomment-696667483
I mean that the physical shuffle doesn't happen, so each shuffle task will generate at most `numReducers` files. The overall number will be at most `numMappers * numReducers`. If we add a check, I'm not sure whether the local shuffle reader will ever be applied in practice. In our use cases, the target bucket tables usually have more than 1000 buckets.
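
To make the file-count arithmetic concrete, here is a minimal sketch of the bound described above. The helper name `max_output_files` and the value of 200 mappers are hypothetical and chosen only for illustration; the 1000 reducers figure echoes the "more than 1000 buckets" case mentioned in the comment. This is not Spark code, just the multiplication spelled out.

```python
def max_output_files(num_mappers: int, num_reducers: int) -> int:
    """Upper bound on output files when every task may write up to
    num_reducers files and there are num_mappers such tasks."""
    if num_mappers < 0 or num_reducers < 0:
        raise ValueError("task counts must be non-negative")
    return num_mappers * num_reducers


# Hypothetical job: 200 map-side tasks writing into a table with 1000 buckets.
num_mappers = 200      # assumption, not from the PR
num_reducers = 1000    # matches the "more than 1000 buckets" scenario
print(max_output_files(num_mappers, num_reducers))  # 200000
```

Even a moderate number of mappers pushes the bound into the hundreds of thousands of files once the bucket count passes 1000, which is why a check on the file count might rule out the local shuffle reader in most such jobs.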
