[GitHub] [spark] siknezevic commented on pull request #27246: [SPARK-30536][CORE][SQL] Sort-merge join operator spilling performance improvements

GitBox Fri, 19 Jun 2020 17:17:15 -0700


siknezevic commented on pull request #27246:
URL: https://github.com/apache/spark/pull/27246#issuecomment-646904374



   > > Could you please let me know whether it would be OK to hard-code the read buffer size to 1024?
   > 
   > Do you think the performance is independent of the running platform, e.g., CPU arch and disk I/O? If it's independent, hard-coding it looks okay.
   
   The micro-benchmark was run on my dev box (a single machine). The 10TB benchmark was run on a real cluster of 6+1 nodes (physical Ubuntu machines) with fast storage, and the performance hit is visible in both cases. So I think it is independent of the running platform. I will submit a new patch soon. Thank you.




