Github user galv commented on a diff in the pull request:
https://github.com/apache/spark/pull/21511#discussion_r198653002
--- Diff:
resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/Config.scala
---
@@ -104,6 +104,20 @@ private[spark] object Config extends Logging {
.stringConf
.createOptional
+ val KUBERNETES_EXECUTOR_LIMIT_GPUS =
--- End diff --
Sometimes you need it. For example, to reduce data across multiple
executors, you would ideally run a ring all-reduce among the executors, but
you cannot really do that today because executors are scheduled
independently of one another. The best you can do for now is to gather all of
the data to the driver and perform the reduction there. You can learn more in
the SPIP for Project Hydrogen (barrier execution).
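
To make the contrast concrete, here is a minimal sketch in plain Python that
simulates both approaches in a single process. It does not use any Spark API;
the function names (`driver_reduce`, `ring_all_reduce`) and the in-memory
"workers" are hypothetical and stand in for executors only for illustration.

```python
def driver_reduce(vectors):
    """Baseline: every worker ships its whole vector to one place (the driver)."""
    return [sum(column) for column in zip(*vectors)]


def ring_all_reduce(vectors):
    """Simulate a ring all-reduce (sum) over equally sized vectors.

    Worker i only ever sends to worker (i + 1) % n. Returns one result
    vector per worker; all of them end up holding the full sum.
    """
    n = len(vectors)
    if n == 0:
        return []
    length = len(vectors[0])
    if any(len(v) != length for v in vectors):
        raise ValueError("all vectors must have the same length")

    # Split each vector into n contiguous chunks (some may be empty).
    bounds = [(k * length // n, (k + 1) * length // n) for k in range(n)]
    data = [list(v) for v in vectors]

    # Phase 1, reduce-scatter: after n - 1 steps, worker i holds the
    # fully summed chunk (i + 1) % n.
    for step in range(n - 1):
        # Snapshot outgoing messages so all sends in a step are simultaneous.
        messages = []
        for i in range(n):
            chunk = (i - step) % n
            lo, hi = bounds[chunk]
            messages.append((chunk, data[i][lo:hi]))
        for i, (chunk, payload) in enumerate(messages):
            dst = (i + 1) % n
            lo, _ = bounds[chunk]
            for k, value in enumerate(payload):
                data[dst][lo + k] += value

    # Phase 2, all-gather: circulate the finished chunks around the ring.
    for step in range(n - 1):
        messages = []
        for i in range(n):
            chunk = (i + 1 - step) % n
            lo, hi = bounds[chunk]
            messages.append((chunk, data[i][lo:hi]))
        for i, (chunk, payload) in enumerate(messages):
            dst = (i + 1) % n
            lo, hi = bounds[chunk]
            data[dst][lo:hi] = payload

    return data


workers = [[1, 2, 3, 4], [10, 20, 30, 40], [100, 200, 300, 400]]
ring_result = ring_all_reduce(workers)
driver_result = driver_reduce(workers)
```

In the ring version each worker exchanges only one chunk per step with its
neighbour, which is why it needs all participants to be running at the same
time; the driver version has no such requirement but funnels every vector
through a single node.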
---