[
https://issues.apache.org/jira/browse/SPARK-20950?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
caoxuewen updated SPARK-20950:
------------------------------
Description:
This PR makes two improvements:
1. Add the config spark.shuffle.spill.diskWriteBufferSize to set the
diskWriteBufferSize of ShuffleExternalSorter.
forceSorterToSpill was tested with different diskWriteBufferSize values.
The average time over 10 runs is as follows (in ms):
~diskWriteBufferSize:  1M   512K  256K  128K  64K   32K   16K   8K    4K
------------------------------------------------------------------------
RecordSize = 2.5M      742  722   694   686   667   668   671   669   683
RecordSize = 1M        294  293   292   287   283   285   281   279   285~
2. Remove the outputBufferSizeInBytes and inputBufferSizeInBytes
initialization in the mergeSpillsWithFileStream function.
was:
This PR makes two improvements:
1. Add the config spark.shuffle.spill.diskWriteBufferSize to set the
diskWriteBufferSize of ShuffleExternalSorter.
forceSorterToSpill was tested with different diskWriteBufferSize values.
The average time over 10 runs is as follows (in ms):
diskWriteBufferSize:   1M   512K  256K  128K  64K   32K   16K   8K    4K
------------------------------------------------------------------------
RecordSize = 2.5M      742  722   694   686   667   668   671   669   683
RecordSize = 1M        294  293   292   287   283   285   281   279   285
2. Remove the outputBufferSizeInBytes and inputBufferSizeInBytes
initialization in the mergeSpillsWithFileStream function.
> Add a new config for diskWriteBufferSize, which was previously hard-coded
> -------------------------------------------------------------------------
>
> Key: SPARK-20950
> URL: https://issues.apache.org/jira/browse/SPARK-20950
> Project: Spark
> Issue Type: Improvement
> Components: Spark Core
> Affects Versions: 2.2.0
> Reporter: caoxuewen
> Priority: Trivial
>
> This PR makes two improvements:
> 1. Add the config spark.shuffle.spill.diskWriteBufferSize to set the
> diskWriteBufferSize of ShuffleExternalSorter.
> forceSorterToSpill was tested with different diskWriteBufferSize values.
> The average time over 10 runs is as follows (in ms):
> ~diskWriteBufferSize:  1M   512K  256K  128K  64K   32K   16K   8K    4K
> ------------------------------------------------------------------------
> RecordSize = 2.5M      742  722   694   686   667   668   671   669   683
> RecordSize = 1M        294  293   292   287   283   285   281   279   285~
> 2. Remove the outputBufferSizeInBytes and inputBufferSizeInBytes
> initialization in the mergeSpillsWithFileStream function.
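
To illustrate what a disk write buffer size controls, here is a minimal sketch, not the ShuffleExternalSorter code itself. It assumes a record is copied to the output in chunks of at most bufferSize bytes, so a smaller buffer means more write calls per record. The class DiskWriteBufferSketch, its methods, and the record size of 2,621,440 bytes are hypothetical names and values chosen for illustration only.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.util.Arrays;

// Hypothetical illustration; none of these names come from Spark.
public class DiskWriteBufferSketch {

    /** In-memory stand-in for a disk writer that counts bulk write calls. */
    static final class CountingStream extends OutputStream {
        final ByteArrayOutputStream sink = new ByteArrayOutputStream();
        int writeCalls = 0;

        @Override
        public void write(int b) {
            sink.write(b);
            writeCalls++;
        }

        @Override
        public void write(byte[] b, int off, int len) {
            sink.write(b, off, len);
            writeCalls++;
        }
    }

    /** Copies one record to out through a reusable buffer of bufferSize bytes. */
    static void writeRecord(byte[] record, int bufferSize, OutputStream out)
            throws IOException {
        if (bufferSize <= 0) {
            throw new IllegalArgumentException("bufferSize must be positive");
        }
        byte[] writeBuffer = new byte[bufferSize];
        int position = 0;
        int remaining = record.length;
        while (remaining > 0) {
            // Transfer at most one buffer's worth of bytes per write call.
            int toTransfer = Math.min(bufferSize, remaining);
            System.arraycopy(record, position, writeBuffer, 0, toTransfer);
            out.write(writeBuffer, 0, toTransfer);
            position += toTransfer;
            remaining -= toTransfer;
        }
    }

    /** Returns how many write calls one record of recordLength bytes needs. */
    static int countWrites(int recordLength, int bufferSize) {
        byte[] record = new byte[recordLength];
        for (int i = 0; i < record.length; i++) {
            record[i] = (byte) i;
        }
        CountingStream out = new CountingStream();
        try {
            writeRecord(record, bufferSize, out);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // Sanity check: chunking must not change the bytes written.
        if (!Arrays.equals(record, out.sink.toByteArray())) {
            throw new IllegalStateException("output differs from input");
        }
        return out.writeCalls;
    }

    public static void main(String[] args) {
        int recordLength = 2_621_440; // illustrative record size, in bytes
        int[] bufferSizes = {1_048_576, 65_536, 4_096};
        for (int size : bufferSizes) {
            System.out.println(size + " -> " + countWrites(recordLength, size) + " writes");
        }
    }
}
```

The sketch only counts write calls. It does not measure time, so it says nothing about which buffer size is fastest on a given disk; the timings in the table above are the reporter's measurements.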