Github user brad-kaiser commented on a diff in the pull request:
https://github.com/apache/spark/pull/19836#discussion_r155267282
--- Diff: core/src/main/scala/org/apache/spark/storage/BlockManagerMasterEndpoint.scala ---
@@ -159,11 +160,16 @@ class BlockManagerMasterEndpoint(
     // Ask the slaves to remove the RDD, and put the result in a sequence of Futures.
     // The dispatcher is used as an implicit argument into the Future sequence construction.
     val removeMsg = RemoveRdd(rddId)
-    Future.sequence(
-      blockManagerInfo.values.map { bm =>
-        bm.slaveEndpoint.ask[Int](removeMsg)
-      }.toSeq
-    )
+
+    val futures = blockManagerInfo.values.map { bm =>
+      bm.slaveEndpoint.ask[Int](removeMsg).recover {
+        case e: IOException =>
--- End diff ---
I think the logic for catching the error still applies even without dynamic allocation. If one of your nodes goes down while you happen to be in .unpersist, you wouldn't want your whole job to fail. Dynamic allocation just makes this scenario more likely.
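For anyone following along, here is a minimal, self-contained sketch of the pattern under discussion. Note the assumptions: askSlave is a hypothetical stand-in for bm.slaveEndpoint.ask[Int](removeMsg), and the fallback value of 0 is assumed, since the quoted diff is truncated before the recovery body.

```scala
import java.io.IOException
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._
import scala.util.Random

object UnpersistRecoverSketch {
  implicit val ec: ExecutionContext = ExecutionContext.global

  // Hypothetical stand-in for bm.slaveEndpoint.ask[Int](removeMsg): the ask
  // may fail with an IOException if the node has gone down.
  def askSlave(id: Int): Future[Int] =
    Future {
      if (Random.nextBoolean()) throw new IOException(s"slave $id unreachable")
      else 1 // number of blocks removed on this slave
    }

  def main(args: Array[String]): Unit = {
    // Without .recover, a single failed ask fails the whole Future.sequence,
    // and with it the unpersist call. With .recover, a dead node degrades to
    // a fallback value and the removals on the remaining nodes still count.
    val futures = (1 to 5).map { id =>
      askSlave(id).recover {
        case e: IOException =>
          println(s"ignoring removeRdd failure from slave $id: ${e.getMessage}")
          0 // assumed fallback: count a lost node as zero blocks removed
      }
    }
    val total = Await.result(Future.sequence(futures), 10.seconds).sum
    println(s"blocks removed across reachable slaves: $total")
  }
}
```

The point is that .recover moves failure handling from the combined future onto each per-slave future, so Future.sequence only fails for errors other than a lost node.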