[ 
https://issues.apache.org/jira/browse/SPARK-20950?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

caoxuewen updated SPARK-20950:
------------------------------
    Description: 
This PR makes two improvements:
1. Make the diskWriteBufferSize of ShuffleExternalSorter configurable through 
spark.shuffle.spill.diskWriteBufferSize.
Testing forceSorterToSpill with different diskWriteBufferSize values gave the 
following average times over 10 runs (in ms):
bq. diskWriteBufferSize:       1M    512K    256K    128K    64K    32K    16K    8K    4K
bq. ---------------------------------------------------------------------------------------
bq. RecordSize = 2.5M          742   722     694     686     667    668    671    669   683
bq. RecordSize = 1M            294   293     292     287     283    285    281    279   285
2. Remove the initialization of outputBufferSizeInBytes and inputBufferSizeInBytes 
in the mergeSpillsWithFileStream function.

  was:
This PR makes two improvements:
1. Make the diskWriteBufferSize of ShuffleExternalSorter configurable through 
spark.shuffle.spill.diskWriteBufferSize.
Testing forceSorterToSpill with different diskWriteBufferSize values gave the 
following average times over 10 runs (in ms):

{quote}diskWriteBufferSize:       1M    512K    256K    128K    64K    32K    16K    8K    4K
---------------------------------------------------------------------------------------
RecordSize = 2.5M          742   722     694     686     667    668    671    669   683
RecordSize = 1M            294   293     292     287     283    285    281    279   285{quote}

2. Remove the initialization of outputBufferSizeInBytes and inputBufferSizeInBytes 
in the mergeSpillsWithFileStream function.


> add a new config to diskWriteBufferSize which is hard coded before
> ------------------------------------------------------------------
>
>                 Key: SPARK-20950
>                 URL: https://issues.apache.org/jira/browse/SPARK-20950
>             Project: Spark
>          Issue Type: Improvement
>          Components: Spark Core
>    Affects Versions: 2.2.0
>            Reporter: caoxuewen
>            Priority: Trivial
>
> This PR makes two improvements:
> 1. Make the diskWriteBufferSize of ShuffleExternalSorter configurable through 
> spark.shuffle.spill.diskWriteBufferSize.
> Testing forceSorterToSpill with different diskWriteBufferSize values gave the 
> following average times over 10 runs (in ms):
> bq. diskWriteBufferSize:       1M    512K    256K    128K    64K    32K    16K    8K    4K
> bq. ---------------------------------------------------------------------------------------
> bq. RecordSize = 2.5M          742   722     694     686     667    668    671    669   683
> bq. RecordSize = 1M            294   293     292     287     283    285    281    279   285
> 2. Remove the initialization of outputBufferSizeInBytes and inputBufferSizeInBytes 
> in the mergeSpillsWithFileStream function.
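
For readers unfamiliar with what the write buffer size controls, here is a minimal sketch of copying one record to an output stream through a fixed-size, reusable buffer. It is an illustration only, not the Spark implementation: the function name write_record_in_chunks, the in-memory sink, and the sample sizes are all hypothetical, and it does not reproduce the timings in the table above.

```python
import io


def write_record_in_chunks(record: bytes, out, buffer_size: int) -> int:
    """Copy `record` to `out` through a reusable buffer of `buffer_size` bytes.

    Returns the number of write calls issued. Hypothetical helper for
    illustration; not part of Spark.
    """
    if buffer_size <= 0:
        raise ValueError("buffer_size must be positive")
    write_buffer = bytearray(buffer_size)  # allocated once, reused for every chunk
    view = memoryview(record)
    position = 0
    writes = 0
    while position < len(record):
        # Transfer at most one buffer's worth of the remaining bytes.
        to_transfer = min(buffer_size, len(record) - position)
        write_buffer[:to_transfer] = view[position:position + to_transfer]
        out.write(write_buffer[:to_transfer])
        position += to_transfer
        writes += 1
    return writes


if __name__ == "__main__":
    record = bytes(range(256)) * 4096  # 1 MiB of sample data
    for size in (1024 * 1024, 64 * 1024, 4 * 1024):
        sink = io.BytesIO()
        calls = write_record_in_chunks(record, sink, size)
        assert sink.getvalue() == record  # the output must match the input
        print(f"buffer={size // 1024}K write calls={calls}")
```

A smaller buffer means more, smaller writes per record, and a larger one means fewer, larger writes. Which setting is fastest depends on the record size and the underlying writer, which is the motivation for making the value configurable.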



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
