[GitHub] [spark] manuzhang commented on pull request #29797: [SPARK-32932][SQL] Do not use local shuffle reader on RepartitionByExpression when coalescing disabled

GitBox Tue, 22 Sep 2020 04:41:46 -0700


manuzhang commented on pull request #29797:
URL: https://github.com/apache/spark/pull/29797#issuecomment-696667483



   I mean that the physical shuffle doesn't happen, so each shuffle task will 
generate at most `numReducers` files. The overall number of files will then be 
at most `numMappers * numReducers`. 
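
   To make the arithmetic concrete, here is a minimal sketch of that upper 
bound. The function name `max_output_files` and the sample figures (200 map 
tasks, 1000 reducers) are hypothetical and chosen only for illustration; they 
are not taken from Spark or from the PR.

```python
def max_output_files(num_mappers: int, num_reducers: int) -> int:
    """Upper bound on the file count when each of `num_mappers` tasks
    writes at most `num_reducers` files (hypothetical helper)."""
    if num_mappers < 0 or num_reducers < 0:
        raise ValueError("task counts must be non-negative")
    return num_mappers * num_reducers


if __name__ == "__main__":
    # Assumed example figures: 200 map tasks, 1000 reducers.
    print(max_output_files(200, 1000))
```

   The bound grows with the product of the two counts, so a large reducer 
count inflates the worst case quickly even when the number of map tasks is 
modest.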
   
   If we add a check, I'm not sure whether the local shuffle reader will ever 
be applied in practice.  In our use cases, the target bucket tables usually 
have more than 1000 buckets. 


