GitHub user heary-cao opened a pull request:

    https://github.com/apache/spark/pull/19693

    [CORE]improved statistical shuffle write time

    ## What changes were proposed in this pull request?
    
    Creating the file to write to and creating a disk writer both involve 
interacting with the disk and can take a long time when we open or close many 
files, so this time should be included in the shuffle write time.
    
    Currently, the time recorded in mergeSpillsWithTransferTo covers only 
writing the merged file. The time spent opening and closing the many spill 
files being merged is not included in the shuffle write time.
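
The idea can be sketched in plain Java as follows. This is only an illustration of the timing boundary, not the code changed by this pull request: the class `MergeTimingSketch`, the counter `writeTimeNanos`, and the method `mergeWithTransferTo` are hypothetical names, and Spark's own metrics classes are deliberately not used. The timer starts before any file is opened and stops after every channel has been closed, so open and close costs are counted along with the transfer itself.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.channels.FileChannel;

// Hypothetical sketch; none of these names come from Spark.
final class MergeTimingSketch {
    // Accumulated nanoseconds spent merging, including file open and close.
    static long writeTimeNanos = 0L;

    static void mergeWithTransferTo(File[] spills, File output) throws IOException {
        // Start the clock before the output file is created or opened.
        final long start = System.nanoTime();
        try (FileOutputStream out = new FileOutputStream(output);
             FileChannel outChannel = out.getChannel()) {
            for (File spill : spills) {
                // Opening and closing each spill file happens inside the timed region.
                try (FileInputStream in = new FileInputStream(spill);
                     FileChannel inChannel = in.getChannel()) {
                    final long size = inChannel.size();
                    long pos = 0L;
                    // transferTo may copy fewer bytes than requested, so loop.
                    while (pos < size) {
                        pos += inChannel.transferTo(pos, size - pos, outChannel);
                    }
                }
            }
        } finally {
            // try-with-resources closes its resources before this finally block
            // runs, so the elapsed time also covers closing the output file.
            writeTimeNanos += System.nanoTime() - start;
        }
    }
}
```

Measuring only around the `transferTo` loop would leave out the open and close calls, which is the gap described above.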
    
    ## How was this patch tested?
    
    Existing test cases.


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/heary-cao/spark task_statistics

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/19693.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #19693
    
----
commit e1d6df4cecc757a7f66feefa2e3bd6816e7abd3f
Author: caoxuewen <[email protected]>
Date:   2017-11-08T07:57:28Z

    improved statistical shuffle write time

----


---
