GitHub user JoshRosen opened a pull request:

    https://github.com/apache/spark/pull/11660

    [SPARK-XXXXX] Guard against race condition when re-caching disk blocks in memory

    When reading data from the DiskStore and attempting to cache it back into
    the memory store, we should guard against race conditions in which multiple
    readers attempt to re-cache the same block in memory.
    
    This patch accomplishes this by synchronizing on the block's `BlockInfo`
    object while trying to re-cache a block.
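
The idea described above can be illustrated with a minimal, language-neutral sketch. It is not the code from this patch: Spark's BlockManager is written in Scala, and the class names, the `maybe_cache_disk_bytes_in_memory` helper (loosely modeled on the `maybeCache*` name in the commit log), and the per-block `lock` attribute are all hypothetical stand-ins. The lock plays the role that synchronizing on the `BlockInfo` object plays on the JVM: only one reader at a time may check the memory store and re-cache a given block.

```python
import threading


class BlockInfo:
    """Hypothetical per-block metadata. Its lock stands in for the JVM
    monitor that `synchronized` on the BlockInfo object would use."""

    def __init__(self, block_id):
        self.block_id = block_id
        self.lock = threading.Lock()


class MemoryStore:
    """Hypothetical in-memory block store that counts how often it is written."""

    def __init__(self):
        self._blocks = {}
        self.put_count = 0

    def contains(self, block_id):
        return block_id in self._blocks

    def get(self, block_id):
        return self._blocks[block_id]

    def put(self, block_id, data):
        self._blocks[block_id] = data
        self.put_count += 1


def maybe_cache_disk_bytes_in_memory(info, memory_store, disk_bytes):
    """Re-cache bytes read from disk, unless another reader already did.

    The check and the put happen under the block's lock, so two readers
    cannot both observe "not cached" and both insert the same block.
    """
    with info.lock:
        if memory_store.contains(info.block_id):
            # Another reader won the race; reuse its cached copy.
            return memory_store.get(info.block_id)
        memory_store.put(info.block_id, disk_bytes)
        return disk_bytes


def run_concurrent_readers(n_readers=8):
    """Start several readers of the same block and return the number of puts."""
    info = BlockInfo("rdd_0_0")
    store = MemoryStore()
    threads = [
        threading.Thread(
            target=maybe_cache_disk_bytes_in_memory,
            args=(info, store, b"payload"),
        )
        for _ in range(n_readers)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return store.put_count
```

Calling `run_concurrent_readers()` should report a single put regardless of how many readers are started, because every reader after the first finds the block already in the memory store. Locking per block rather than on the whole store keeps readers of unrelated blocks from blocking each other.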
    
    (I will file a JIRA ticket as soon as ASF JIRA is no longer down or laggy.)

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/JoshRosen/spark concurrent-recaching-fixes

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/11660.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #11660
    
----
commit 00ea8d350eeb6bfe39c809d9f703a17ef710618c
Author: Josh Rosen <[email protected]>
Date:   2016-03-11T21:22:09Z

    De-duplicate disk -> memory caching code.

commit a0c68e20d1ef86eded51b9212e0c888acf5955e1
Author: Josh Rosen <[email protected]>
Date:   2016-03-11T21:29:56Z

    Clarify that read lock must be held by caller of maybeCache*

commit 7f678d25ba6a8700917093e13896dbb255241fd3
Author: Josh Rosen <[email protected]>
Date:   2016-03-11T21:44:21Z

    Synchronize on blockInfo to guard against concurrent re-caching.

commit 5342712afeeb76ac8c30bb4bb884dc0ba900fb92
Author: Josh Rosen <[email protected]>
Date:   2016-03-11T21:47:53Z

    Add some BlockManager.dispose() calls to free disk buffer earlier.

----


