cloud-fan commented on issue #22163: [SPARK-25166][CORE]Reduce the number of 
write operations for shuffle write.
URL: https://github.com/apache/spark/pull/22163#issuecomment-534013793
 
 
   > Currently, only one record is written to a buffer each time, which 
increases the number of copies.
   
   This is very confusing. If this were true, I don't think Spark shuffle could 
have reasonable performance.
   
   Looking at the code, it seems that what you are actually trying to do is 
avoid flushing the buffer to disk when a new partition starts. We can keep 
writing to the buffer as long as it is not full, even when we hit a new 
partition. Can you update the PR description to make this clearer?
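
   To illustrate the distinction, here is a minimal sketch, not Spark's actual 
shuffle writer code. It only counts how often a buffer would be flushed to disk 
under the two policies. The function `count_flushes`, its parameters, and the 
record sizes are hypothetical names and numbers chosen for illustration, and 
records are assumed to arrive already grouped by partition.

```python
def count_flushes(records, buffer_capacity, flush_on_new_partition):
    """Count disk flushes for a stream of (partition_id, size_in_bytes) records.

    Hypothetical model for illustration only; it does not mirror Spark internals.
    """
    flushes = 0
    buffered = 0              # bytes currently held in the in-memory buffer
    current_partition = None

    for partition_id, size in records:
        partition_changed = (
            current_partition is not None and partition_id != current_partition
        )
        # Policy A: flush whenever a new partition starts, even if the buffer
        # still has room.
        if flush_on_new_partition and partition_changed and buffered > 0:
            flushes += 1
            buffered = 0
        # Both policies: flush when the next record would not fit.
        if buffered > 0 and buffered + size > buffer_capacity:
            flushes += 1
            buffered = 0
        buffered += size
        current_partition = partition_id

    # Flush whatever remains at the end of the stream.
    if buffered > 0:
        flushes += 1
    return flushes


# 8 partitions, 2 records of 100 bytes each, and a 1024-byte buffer (assumed values).
records = [(p, 100) for p in range(8) for _ in range(2)]
eager = count_flushes(records, 1024, flush_on_new_partition=True)
lazy = count_flushes(records, 1024, flush_on_new_partition=False)
print(eager, lazy)  # prints "8 2"
```

   With many small partitions, flushing at every partition boundary costs one 
write per partition, while flushing only when the buffer is full costs roughly 
one write per buffer of data.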
