Github user nezihyigitbasi commented on a diff in the pull request:
https://github.com/apache/spark/pull/11241#discussion_r53529534
--- Diff: core/src/main/scala/org/apache/spark/storage/BlockManager.scala ---
@@ -135,6 +135,11 @@ private[spark] class BlockManager(
   // Whether to compress shuffle output temporarily spilled to disk
   private val compressShuffleSpill =
     conf.getBoolean("spark.shuffle.spill.compress", true)
+  // Max number of failures before this block manager refreshes the block locations from the driver
+  private val maxFailuresBeforeLocationRefresh =
+    conf.getInt("spark.block.failures.beforeLocationRefresh", Int.MaxValue)
--- End diff --
If we really want to make this a constant, I think it should be < 10. With
the default retry wait/count settings each failure costs 15 seconds, so 10
failures correspond to 15 s * 10 = 150 s = 2.5 minutes spent just looping
through removed executors.
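
To make the arithmetic concrete, here is a minimal sketch of the worst-case
delay as a function of the failure threshold. The retry count of 3 and the
retry wait of 5 seconds are assumptions chosen to reproduce the 15 seconds
per failure quoted above. The object and method names are hypothetical and
are not part of the PR.

```scala
object RefreshDelayEstimate {
  // Assumed values (not taken from the PR): 3 retries with a 5 s wait each,
  // which gives the 15 s per failed fetch used in the comment above.
  val retryCount: Int = 3
  val retryWaitSeconds: Int = 5

  /** Worst-case seconds spent retrying before `maxFailures` failures accumulate. */
  def worstCaseDelaySeconds(maxFailures: Int): Long =
    maxFailures.toLong * retryCount * retryWaitSeconds

  def main(args: Array[String]): Unit = {
    // Print the estimated delay for a few candidate thresholds.
    Seq(1, 5, 10).foreach { n =>
      println(s"maxFailures=$n -> ${worstCaseDelaySeconds(n)} s")
    }
  }
}
```

Under these assumptions a threshold of 10 gives 150 seconds, matching the
2.5 minutes above, while a threshold of 5 would cap the delay at 75 seconds.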