Onur Satici created SPARK-30949:
-----------------------------------
Summary: Driver cores in kubernetes are coupled with container
resources, not spark.driver.cores
Key: SPARK-30949
URL: https://issues.apache.org/jira/browse/SPARK-30949
Project: Spark
Issue Type: Dependency upgrade
Components: Kubernetes
Affects Versions: 3.0.0
Reporter: Onur Satici
Drivers submitted in kubernetes cluster mode set the parallelism of various
components like 'RpcEnv', 'MemoryManager', 'BlockManager' from inferring the
number of available cores by calling:
{code:java}
Runtime.getRuntime().availableProcessors()
{code}
By using this, spark applications running on java 8 or older incorrectly get
the total number of cores in the host, [ignoring the cgroup limits set by
kubernetes|[https://bugs.openjdk.java.net/browse/JDK-6515172]]. Java 9 and
newer runtimes do not have this problem.
Orthogonal to this, it is currently not possible to decouple resource limits on
the driver container with the amount of parallelism of the various network and
memory components listed above.
My proposal is to use the 'spark.driver.cores' configuration to get the amount
of parallelism, [like we do for
YARN|[https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/SparkContext.scala#L2762-L2767]].
This will enable users to specify 'spark.driver.cores' to set parallelism, and
specify 'spark.kubernetes.driver.requests.cores' to limit the resource requests
of the driver container. Further, this will remove the need to call
'availableProcessors()', thus the same number of cores will be used for
parallelism independent of the java runtime version.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]