HyukjinKwon commented on a change in pull request #21092:
URL: https://github.com/apache/spark/pull/21092#discussion_r540240095
##########
File path:
resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/Config.scala
##########
@@ -154,6 +176,24 @@ private[spark] object Config extends Logging {
.checkValue(interval => interval > 0, s"Logging interval must be a positive time value.")
.createWithDefaultString("1s")
+ val MEMORY_OVERHEAD_FACTOR =
+ ConfigBuilder("spark.kubernetes.memoryOverheadFactor")
+ .doc("This sets the Memory Overhead Factor that will allocate memory to non-JVM jobs " +
+ "which in the case of JVM tasks will default to 0.10 and 0.40 for non-JVM jobs")
+ .doubleConf
+ .checkValue(mem_overhead => mem_overhead >= 0 && mem_overhead < 1,
+ "Ensure that memory overhead is a double between 0 --> 1.0")
+ .createWithDefault(0.1)
+
+ val PYSPARK_MAJOR_PYTHON_VERSION =
+ ConfigBuilder("spark.kubernetes.pyspark.pythonversion")
Review comment:
Sorry for commenting on an ancient PR, but I couldn't help it. Why did we
add a configuration to control the Python version instead of using the
existing `PYSPARK_PYTHON` and `PYSPARK_DRIVER_PYTHON` environment variables?
Doing this through a configuration breaks or disables many things, for
example PEX
(https://medium.com/criteo-labs/packaging-code-with-pex-a-pyspark-example-9057f9f144f3),
which requires setting `PYSPARK_PYTHON` and `PYSPARK_DRIVER_PYTHON` manually.
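
To make the concern concrete, here is a minimal sketch of the alternative: honour an explicitly set `PYSPARK_PYTHON` / `PYSPARK_DRIVER_PYTHON` first, and only fall back to a major-version setting when the variable is absent. `PythonExecResolver`, `resolve`, and the fallback names `python` / `python3` are hypothetical and chosen for illustration; this is not code from the PR or from Spark.

```scala
// Hypothetical helper for illustration only; not Spark's actual implementation.
object PythonExecResolver {

  /**
   * Picks the Python executable to launch.
   *
   * @param env          environment variables (e.g. sys.env)
   * @param envKey       "PYSPARK_PYTHON" or "PYSPARK_DRIVER_PYTHON"
   * @param majorVersion value of a version-style setting, "2" or "3" (assumed)
   */
  def resolve(env: Map[String, String], envKey: String, majorVersion: String): String =
    // An explicitly set, non-empty environment variable always wins,
    // so a PEX file or any custom interpreter path keeps working.
    env.get(envKey).filter(_.nonEmpty).getOrElse {
      // Assumed fallback mapping from major version to executable name.
      majorVersion match {
        case "2" => "python"
        case "3" => "python3"
        case other =>
          throw new IllegalArgumentException(s"Unsupported Python major version: $other")
      }
    }
}
```

With this ordering, `PythonExecResolver.resolve(Map("PYSPARK_PYTHON" -> "./my_app.pex"), "PYSPARK_PYTHON", "3")` returns `"./my_app.pex"`, while an empty environment falls back to `"python3"`.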