Github user felixcheung commented on a diff in the pull request:
https://github.com/apache/spark/pull/21033#discussion_r181627739
--- Diff: resource-managers/mesos/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosCoarseGrainedSchedulerBackend.scala ---
@@ -495,9 +500,8 @@ private[spark] class MesosCoarseGrainedSchedulerBackend(
         launchTasks = true
         val taskId = newMesosTaskId()
         val offerCPUs = getResource(resources, "cpus").toInt
-        val taskGPUs = Math.min(
-          Math.max(0, maxGpus - totalGpusAcquired),
-          getResource(resources, "gpus").toInt)
-
+        val offerGPUs = getResource(resources, "gpus").toInt
+        var taskGPUs = executorGpus
--- End diff --
So it looks like we are changing the behavior of the value set in
`spark.mesos.gpus.max` (which has been there since 2.1)? Are we OK with
that? It might break existing deployments; is there a migration guide for
something like this?
In addition, are there other changes to the defaults? Specifically, does
`taskGPUs` now default to 0?
Also, should we warn if `spark.mesos.executor.gpus` is greater than
`spark.mesos.gpus.max`? Something like the sketch below, perhaps.
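A rough sketch of such a check; the config keys are the real ones, but the
object, method name, defaults, and placement are guesses, not the actual
backend code:

```scala
import org.apache.spark.SparkConf

object GpuConfCheckSketch {
  // Warn when the per-executor GPU request can exceed the global GPU cap.
  def warnOnGpuConfMismatch(conf: SparkConf): Unit = {
    val executorGpus = conf.getInt("spark.mesos.executor.gpus", 0)
    val maxGpus = conf.getInt("spark.mesos.gpus.max", 0)
    if (executorGpus > maxGpus) {
      // In the backend this would presumably be logWarning from the Logging
      // trait; println keeps the sketch dependency-free.
      println(s"Warning: spark.mesos.executor.gpus ($executorGpus) exceeds " +
        s"spark.mesos.gpus.max ($maxGpus); GPU requests may never be satisfied.")
    }
  }
}
```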
---