Github user brad-kaiser commented on a diff in the pull request:
https://github.com/apache/spark/pull/19836#discussion_r155267282
--- Diff: core/src/main/scala/org/apache/spark/storage/BlockManagerMasterEndpoint.scala ---
@@ -159,11 +160,16 @@ class BlockManagerMasterEndpoint(
     // Ask the slaves to remove the RDD, and put the result in a sequence of Futures.
     // The dispatcher is used as an implicit argument into the Future sequence construction.
     val removeMsg = RemoveRdd(rddId)
-    Future.sequence(
-      blockManagerInfo.values.map { bm =>
-        bm.slaveEndpoint.ask[Int](removeMsg)
-      }.toSeq
-    )
+
+    val futures = blockManagerInfo.values.map { bm =>
+      bm.slaveEndpoint.ask[Int](removeMsg).recover {
+        case e: IOException =>
--- End diff ---
I think the logic for catching the error still applies even without dynamic allocation. If one of your nodes goes down while you happen to be in .unpersist, you wouldn't want your whole job to fail. Dynamic allocation just makes this scenario more likely.
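For anyone following along, here is a minimal, self-contained sketch of the pattern under discussion. Note the assumptions: askSlave is a hypothetical stand-in for bm.slaveEndpoint.ask[Int](removeMsg), and the fallback value of 0 is assumed, since the quoted diff is truncated before the recovery body.

```scala
import java.io.IOException
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._
import scala.util.Random

object UnpersistRecoverSketch {
  implicit val ec: ExecutionContext = ExecutionContext.global

  // Hypothetical stand-in for bm.slaveEndpoint.ask[Int](removeMsg): the ask
  // may fail with an IOException if the node has gone down.
  def askSlave(id: Int): Future[Int] =
    Future {
      if (Random.nextBoolean()) throw new IOException(s"slave $id unreachable")
      else 1 // number of blocks removed on this slave
    }

  def main(args: Array[String]): Unit = {
    // Without .recover, a single failed ask fails the whole Future.sequence,
    // and with it the unpersist call. With .recover, a dead node degrades to
    // a fallback value and the removals on the remaining nodes still count.
    val futures = (1 to 5).map { id =>
      askSlave(id).recover {
        case e: IOException =>
          println(s"ignoring removeRdd failure from slave $id: ${e.getMessage}")
          0 // assumed fallback: count a lost node as zero blocks removed
      }
    }
    val total = Await.result(Future.sequence(futures), 10.seconds).sum
    println(s"blocks removed across reachable slaves: $total")
  }
}
```

The point is that .recover moves failure handling from the combined future onto each per-slave future, so Future.sequence only fails for errors other than a lost node.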