GitHub user jinxing64 opened a pull request:

    https://github.com/apache/spark/pull/20685

    [SPARK-23524] Big local shuffle blocks should not be checked for corruption.

    ## What changes were proposed in this pull request?
    
    In the current code, all local blocks are checked for corruption regardless 
of whether they are big or small, for two reasons:
    
    1. The size in the FetchResult for a local block is set to 0 
(https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/storage/ShuffleBlockFetcherIterator.scala#L327).
    2. SPARK-4105 intended to check only small blocks (size < maxBytesInFlight/3), 
but because of reason 1 the check below never excludes local blocks: 
https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/storage/ShuffleBlockFetcherIterator.scala#L420
    
    Reporting the real size for local blocks fixes this and avoids the OOM 
caused by reading a big block fully into memory for the corruption check.
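    The decision described above can be sketched as follows. This is a 
minimal, simplified illustration, not Spark's actual code; the name 
`shouldDetectCorruption` and the inline size values are hypothetical, though 
48 MB is Spark's default for `spark.reducer.maxSizeInFlight`:

    ```scala
    // Default spark.reducer.maxSizeInFlight is 48 MB.
    val maxBytesInFlight = 48L * 1024 * 1024

    // SPARK-4105 intended to run the (in-memory) corruption check
    // only on blocks smaller than a third of maxBytesInFlight.
    def shouldDetectCorruption(reportedSize: Long): Boolean =
      reportedSize < maxBytesInFlight / 3

    // Bug: local blocks report size 0, so even a huge local block
    // passes the guard and gets read fully into memory.
    assert(shouldDetectCorruption(0L))

    // With the real size reported, a 100 MB local block is excluded.
    assert(!shouldDetectCorruption(100L * 1024 * 1024))
    ```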
    
    ## How was this patch tested?
    
    UT added

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/jinxing64/spark SPARK-23524

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/20685.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #20685
    
----
commit 535916c045b123e803c0f6dbf786076045036167
Author: jx158167 <jx158167@...>
Date:   2018-02-27T09:56:38Z

    [SPARK-23524] Big local shuffle blocks should not be checked for corruption.

----


---
