[ https://issues.apache.org/jira/browse/SPARK-20950?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

caoxuewen updated SPARK-20950:
------------------------------
    Description: 
This PR makes two improvements:
1. Add the spark.shuffle.spill.diskWriteBufferSize setting to configure the
diskWriteBufferSize of ShuffleExternalSorter, which was previously hard-coded.
forceSorterToSpill was tested with different diskWriteBufferSize values.
The average time over 10 runs, in milliseconds, is as follows:
 
{quote}
diskWriteBufferSize:    1M     512K   256K   128K   64K    32K    16K    8K     4K
---------------------------------------------------------------------------------------
RecordSize = 2.5M       742    722    694    686    667    668    671    669    683
RecordSize = 1M         294    293    292    287    283    285    281    279    285
{quote}
 
2. Remove the initialization of outputBufferSizeInBytes and inputBufferSizeInBytes
from the mergeSpillsWithFileStream function.
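
To illustrate what a disk write buffer size controls, here is a minimal sketch of copying one record to an output stream through a fixed-size, reusable buffer. It is a toy model written in Python for brevity, not Spark's ShuffleExternalSorter code; the function name write_record_in_chunks and its parameters are hypothetical. It only counts write calls and says nothing about the timings measured above.

```python
import io


def write_record_in_chunks(record: bytes, out, buffer_size: int) -> int:
    """Copy `record` to `out` through a reusable buffer of `buffer_size` bytes.

    Returns the number of write calls issued. Hypothetical helper for
    illustration only; it does not reproduce Spark's implementation.
    """
    if buffer_size <= 0:
        raise ValueError("buffer_size must be positive")
    buffer = bytearray(buffer_size)  # allocated once, reused for every chunk
    view = memoryview(record)
    writes = 0
    offset = 0
    while offset < len(record):
        n = min(buffer_size, len(record) - offset)
        buffer[:n] = view[offset:offset + n]  # stage the next chunk
        out.write(buffer[:n])                 # flush the staged chunk
        writes += 1
        offset += n
    return writes


if __name__ == "__main__":
    record = bytes(2_621_440)  # a 2.5 MiB record, as in the larger test case
    for size in (1024 * 1024, 64 * 1024, 4 * 1024):
        calls = write_record_in_chunks(record, io.BytesIO(), size)
        print(f"buffer={size} bytes -> {calls} write calls")
```

A smaller buffer means more, smaller writes per record, while a larger buffer costs more memory per writer. Which side of that trade-off wins in practice depends on the workload, which is why making the size configurable is useful.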

  was:
This PR makes two improvements:
1. Add the spark.shuffle.spill.diskWriteBufferSize setting to configure the
diskWriteBufferSize of ShuffleExternalSorter, which was previously hard-coded.
forceSorterToSpill was tested with different diskWriteBufferSize values.
The average time over 10 runs, in milliseconds, is as follows:
bq. 
bq. diskWriteBufferSize:    1M     512K   256K   128K   64K    32K    16K    8K     4K
bq. ---------------------------------------------------------------------------------------
bq. RecordSize = 2.5M       742    722    694    686    667    668    671    669    683
bq. RecordSize = 1M         294    293    292    287    283    285    281    279    285
bq. 
2. Remove the initialization of outputBufferSizeInBytes and inputBufferSizeInBytes
from the mergeSpillsWithFileStream function.


> add a new config to diskWriteBufferSize which is hard coded before
> ------------------------------------------------------------------
>
>                 Key: SPARK-20950
>                 URL: https://issues.apache.org/jira/browse/SPARK-20950
>             Project: Spark
>          Issue Type: Improvement
>          Components: Spark Core
>    Affects Versions: 2.2.0
>            Reporter: caoxuewen
>            Priority: Trivial



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
