[ https://issues.apache.org/jira/browse/SPARK-20950?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
caoxuewen updated SPARK-20950:
------------------------------
    Description:
This PR makes two improvements:
1. Make the diskWriteBufferSize of ShuffleExternalSorter configurable through spark.shuffle.spill.diskWriteBufferSize.
When the diskWriteBufferSize is varied in the forceSorterToSpill test, the average time over 10 runs is as follows (in ms):
bq.
bq. diskWriteBufferSize:  1M    512K  256K  128K  64K   32K   16K   8K    4K
bq. -------------------------------------------------------------------------
bq. RecordSize = 2.5M     742   722   694   686   667   668   671   669   683
bq. RecordSize = 1M       294   293   292   287   283   285   281   279   285
bq.
2. Remove the initialization of outputBufferSizeInBytes and inputBufferSizeInBytes from the mergeSpillsWithFileStream function.

  was:
This PR makes two improvements:
1. Make the diskWriteBufferSize of ShuffleExternalSorter configurable through spark.shuffle.spill.diskWriteBufferSize.
When the diskWriteBufferSize is varied in the forceSorterToSpill test, the average time over 10 runs is as follows (in ms):
{quote}
diskWriteBufferSize:  1M    512K  256K  128K  64K   32K   16K   8K    4K
-------------------------------------------------------------------------
RecordSize = 2.5M     742   722   694   686   667   668   671   669   683
RecordSize = 1M       294   293   292   287   283   285   281   279   285
{quote}
2. Remove the initialization of outputBufferSizeInBytes and inputBufferSizeInBytes from the mergeSpillsWithFileStream function.

> add a new config to diskWriteBufferSize which is hard coded before
> ------------------------------------------------------------------
>
>                 Key: SPARK-20950
>                 URL: https://issues.apache.org/jira/browse/SPARK-20950
>             Project: Spark
>          Issue Type: Improvement
>          Components: Spark Core
>    Affects Versions: 2.2.0
>            Reporter: caoxuewen
>            Priority: Trivial
>
> This PR makes two improvements:
> 1. Make the diskWriteBufferSize of ShuffleExternalSorter configurable through
> spark.shuffle.spill.diskWriteBufferSize.
> When the diskWriteBufferSize is varied in the forceSorterToSpill test, the
> average time over 10 runs is as follows (in ms):
> bq.
> bq. diskWriteBufferSize:  1M    512K  256K  128K  64K   32K   16K   8K    4K
> bq. -------------------------------------------------------------------------
> bq. RecordSize = 2.5M     742   722   694   686   667   668   671   669   683
> bq. RecordSize = 1M       294   293   292   287   283   285   281   279   285
> bq.
> 2. Remove the initialization of outputBufferSizeInBytes and
> inputBufferSizeInBytes from the mergeSpillsWithFileStream function.
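
For illustration, the role of the write buffer size in item 1 can be sketched in plain Java, outside Spark. This is a minimal sketch and not Spark's code: the class WriteBufferSketch and the method copyThroughBuffer are hypothetical names, and it assumes that the sorter copies each record to disk through a byte array of diskWriteBufferSize bytes, so that a smaller buffer means more, smaller writes per record.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;

// Hypothetical illustration only; none of these names come from Spark.
class WriteBufferSketch {

    /**
     * Copies a record to the output stream in chunks of at most bufferSize
     * bytes and returns the number of write calls that were issued.
     */
    static int copyThroughBuffer(byte[] record, int bufferSize, OutputStream out) {
        if (bufferSize <= 0) {
            throw new IllegalArgumentException("bufferSize must be positive");
        }
        // Stand-in for the configurable disk write buffer (an assumption).
        byte[] writeBuffer = new byte[bufferSize];
        int position = 0;
        int remaining = record.length;
        int writes = 0;
        try {
            while (remaining > 0) {
                // Transfer at most one buffer's worth of bytes per iteration.
                int toTransfer = Math.min(bufferSize, remaining);
                System.arraycopy(record, position, writeBuffer, 0, toTransfer);
                out.write(writeBuffer, 0, toTransfer);
                position += toTransfer;
                remaining -= toTransfer;
                writes++;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return writes;
    }

    public static void main(String[] args) {
        byte[] record = new byte[1024 * 1024]; // a 1M record, as in the table
        for (int kib : new int[] {1024, 64, 4}) {
            ByteArrayOutputStream sink = new ByteArrayOutputStream();
            int writes = copyThroughBuffer(record, kib * 1024, sink);
            System.out.println(kib + "K buffer: " + writes + " write calls");
        }
    }
}
```

Running main prints how many write calls a 1M record needs at a few buffer sizes. It counts writes into an in-memory sink and does not measure time, so it says nothing about the timings reported in the description.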