GitHub user squito opened a pull request:

    https://github.com/apache/spark/pull/9214

    [SPARK-8029][core][wip] first successful shuffle task always wins

    Shuffle writers now write to temp files, and when they are done, they
    atomically move those files into the final location *if those files don't
    already exist*.  This way, if one executor ends up executing more than one
    task to generate shuffle output for one partition, the first successful one
    "wins", and all others are ignored.
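
A minimal sketch of the "first successful one wins" commit described above, written in Python for brevity rather than in Spark's own Scala/Java. The names `commit_first_wins` and `write_attempt`, the module-level lock, and the file name in the usage note are hypothetical and are not taken from the patch. The sketch assumes all competing attempts run inside one process, which mirrors the single-executor case the description targets.

```python
import os
import tempfile
import threading

# Hypothetical stand-in for a per-executor coordinator; not Spark's API.
_commit_lock = threading.Lock()


def commit_first_wins(tmp_path, final_path):
    """Move tmp_path to final_path unless final_path already exists.

    Returns True if this attempt's output was committed, and False if an
    earlier attempt already won (the temp file is then discarded).
    """
    with _commit_lock:  # serialize the existence check and the rename
        if os.path.exists(final_path):
            os.remove(tmp_path)  # a previous attempt won; drop our output
            return False
        # os.replace is an atomic rename when both paths share a filesystem.
        os.replace(tmp_path, final_path)
        return True


def write_attempt(directory, final_name, payload):
    """Write payload to a temp file in directory, then try to commit it."""
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "wb") as f:
        f.write(payload)
    return commit_first_wins(tmp_path, os.path.join(directory, final_name))
```

Calling `write_attempt(d, "shuffle_0_0_0.data", b"first")` and then `write_attempt(d, "shuffle_0_0_0.data", b"second")` on the same directory leaves the first payload in place and returns `False` for the second call. The lock only protects attempts within one process; coordinating across processes would need a different mechanism.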
    
    TODO
    - [ ] make sure I'm using the right compression / temp block sizes, per SPARK-3426
    - [ ] run some fault-injection tests

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/squito/spark SPARK-8029_first_wins

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/9214.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #9214
    
----
commit 6140e426f045967e107451336005887e144f6e39
Author: Imran Rashid <[email protected]>
Date:   2015-10-21T19:26:26Z

    ShuffleWriters write to temp file, then go through
    ShuffleOutputCoordinator to atomically move w/ "first one wins"

commit 5854ac8a68474b595c9f02d895f2bb3c2eb59c5a
Author: Imran Rashid <[email protected]>
Date:   2015-10-22T03:17:17Z

    assorted cleanup

commit c3e4456788e4f6a10d07f5ff47eb4d6a8d19f543
Author: Imran Rashid <[email protected]>
Date:   2015-10-22T03:19:07Z

    style

----


