GitHub user shubhamchopra opened a pull request:

    https://github.com/apache/spark/pull/17325

    [SPARK-19803][CORE][TEST] Proactive replication test failures

    ## What changes were proposed in this pull request?
    Executors cache a list of their peers, which is refreshed every minute by
    default. Stale references in this cache were randomly being chosen as
    replication targets. Since those executors had been removed from the
    master, they did not appear in the block locations reported by the master.
    This was fixed by:
    1. Refreshing the peer cache in the block manager before trying to
    proactively replicate. This minimizes the chance of replicating to a
    failed executor.
    2. Explicitly stopping the block manager in the tests. This shuts down the
    RPC endpoint used by the block manager. This way, even if a block manager
    tries to replicate using a stale reference, the attempt fails and the
    replication logic should take care of refreshing the list of peers after
    the failure.
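
The two fixes above can be illustrated with a small toy model. This is only a sketch under stated assumptions, not Spark's actual Scala code: `PeerCache`, `replicate_proactively`, `send_block`, `force_fetch` and `max_failures` are hypothetical names invented for the illustration, and the one-minute TTL simply mirrors the default refresh interval mentioned above.

```python
import time


class PeerCache:
    """Toy stand-in for an executor's cached peer list (hypothetical, not Spark's API)."""

    def __init__(self, fetch_peers, ttl_seconds=60.0, clock=time.monotonic):
        self._fetch_peers = fetch_peers  # callable returning the master's current view
        self._ttl = ttl_seconds
        self._clock = clock
        self._peers = []
        self._last_fetch = None

    def get_peers(self, force_fetch=False):
        """Return cached peers, refetching if the TTL expired or a refresh is forced."""
        now = self._clock()
        expired = self._last_fetch is None or now - self._last_fetch >= self._ttl
        if force_fetch or expired:
            self._peers = list(self._fetch_peers())
            self._last_fetch = now
        return list(self._peers)


def replicate_proactively(cache, send_block, max_failures=1):
    """Refresh the peer list first, then try peers in order.

    After a failed send, the peer list is refreshed again and peers that were
    already tried are skipped. Returns the peer that accepted the block, or
    None if more than `max_failures` sends failed or no peers were left.
    """
    peers = cache.get_peers(force_fetch=True)  # fix 1: refresh before replicating
    tried = set()
    failures = 0
    while peers:
        peer = peers.pop(0)
        tried.add(peer)
        try:
            send_block(peer)
            return peer
        except ConnectionError:
            failures += 1
            if failures > max_failures:
                break
            # Fix 2 relies on this path: a stopped peer fails fast, then we refresh.
            peers = [p for p in cache.get_peers(force_fetch=True) if p not in tried]
    return None


# Usage: the master's view changes while the executor still holds a warm cache.
master_view = ["exec-1", "exec-2", "exec-3"]
cache = PeerCache(lambda: master_view, ttl_seconds=60.0, clock=lambda: 0.0)
cache.get_peers()             # warm the cache
master_view.remove("exec-1")  # exec-1 is removed from the master


def send_block(peer):
    # A removed executor has no live endpoint, so the send fails.
    if peer not in master_view:
        raise ConnectionError(peer)


target = replicate_proactively(cache, send_block)
```

Without the forced refresh, the cached list in this model would still contain `exec-1` until the TTL expired, which is the stale-reference situation the tests were hitting.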
    
    
    ## How was this patch tested?
    Tested manually


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/shubhamchopra/spark SPARK-19803

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/17325.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #17325
    
----
commit 22f9dbd6825939f93f8d32b3ec428f890d361d9f
Author: Shubham Chopra <schopr...@bloomberg.net>
Date:   2017-03-16T22:14:23Z

    Fixing an issue with executors using stale peer references to replicate.

----


