Github user susanxhuynh commented on a diff in the pull request:
https://github.com/apache/spark/pull/20640#discussion_r169676540
--- Diff: resource-managers/mesos/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosCoarseGrainedSchedulerBackend.scala ---
@@ -648,15 +645,6 @@ private[spark] class MesosCoarseGrainedSchedulerBackend(
totalGpusAcquired -= gpus
gpusByTaskId -= taskId
}
- // If it was a failure, mark the slave as failed for blacklisting purposes
- if (TaskState.isFailed(state)) {
- slave.taskFailures += 1
-
- if (slave.taskFailures >= MAX_SLAVE_FAILURES) {
- logInfo(s"Blacklisting Mesos slave $slaveId due to too many failures; " +
--- End diff --
@IgorBerman Yes, in the default case, it would be nice to have this
information about a task failing, especially if it fails repeatedly.
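
A rough sketch of the idea, not a proposal for the actual patch: keep counting failures per slave and surface them in the log, without refusing offers. `TaskFailureTracker`, `warnThreshold`, and the message wording are hypothetical names made up for illustration; nothing here is existing Spark API.

```scala
import scala.collection.mutable

// Hypothetical helper, not part of Spark: counts failed tasks per Mesos slave
// and builds a log message for each failure. It never blacklists anything.
final class TaskFailureTracker(warnThreshold: Int) {
  private val failuresBySlave =
    mutable.Map.empty[String, Int].withDefaultValue(0)

  /** Records one failure for the slave and returns the message to log. */
  def recordFailure(slaveId: String, taskId: String): String = {
    val count = failuresBySlave(slaveId) + 1
    failuresBySlave(slaveId) = count
    if (count >= warnThreshold) {
      // Repeated failures get a more prominent message.
      s"Task $taskId failed on Mesos slave $slaveId ($count failures so far)"
    } else {
      s"Task $taskId failed on Mesos slave $slaveId"
    }
  }

  /** Number of failures recorded so far for the slave. */
  def failures(slaveId: String): Int = failuresBySlave(slaveId)
}
```

In the backend, the returned string would presumably be passed to `logInfo` or `logWarning` where the removed block used to be; whether a threshold is needed at all is an open question.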
---