[ https://issues.apache.org/jira/browse/SPARK-30949?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Onur Satici updated SPARK-30949:
--------------------------------
Description:
Drivers submitted in Kubernetes cluster mode set the parallelism of various
components such as 'RpcEnv', 'MemoryManager', and 'BlockManager' by inferring
the number of available cores via:
{code:java}
Runtime.getRuntime().availableProcessors()
{code}
As a result, Spark applications running on Java 8 or older incorrectly get
the total number of cores on the host, ignoring the cgroup limits set by
Kubernetes (https://bugs.openjdk.java.net/browse/JDK-6515172). Java 9 and newer
runtimes do not have this problem.
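To make the failure mode concrete (hypothetical numbers: a driver container capped at 1 CPU on an 8-core node), the snippet below prints 8 on a Java 8 runtime but 1 on a container-aware Java 9+ runtime:
{code:scala}
// Illustration only. Assumes a driver container capped at 1 CPU on an
// 8-core node: Java 8 ignores the cgroup quota and reports the host's
// 8 cores, while container-aware Java 9+ runtimes report 1.
object AvailableCores {
  def main(args: Array[String]): Unit = {
    println(Runtime.getRuntime.availableProcessors())
  }
}
{code}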
Orthogonal to this, it is currently not possible to decouple the resource
limits of the driver container from the amount of parallelism of the network
and memory components listed above.
My proposal is to use the 'spark.driver.cores' configuration to determine the
amount of parallelism, as is already done for YARN
(https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/SparkContext.scala#L2762-L2767); a paraphrased sketch of that logic follows.
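The sketch below is a simplified paraphrase of the linked YARN branch, with plain string keys substituted for Spark's internal config constants:
{code:scala}
import org.apache.spark.SparkConf

object DriverCores {
  // Paraphrased sketch of the linked SparkContext.numDriverCores logic:
  // in YARN cluster mode, parallelism comes from spark.driver.cores
  // rather than from Runtime.getRuntime().availableProcessors().
  def numDriverCores(master: String, conf: SparkConf): Int = master match {
    case "yarn" if conf != null &&
        conf.get("spark.submit.deployMode", "client") == "cluster" =>
      conf.getInt("spark.driver.cores", 0)
    case _ => 0 // local[N] masters resolve their thread counts separately
  }
}
{code}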
This will enable users to specify 'spark.driver.cores' to set parallelism and
'spark.kubernetes.driver.request.cores' to limit the resource requests of the
driver container. Further, it will remove the need to call
'availableProcessors()', so the same number of cores is used for parallelism
regardless of the Java runtime version.
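A minimal sketch of the proposed change, under the assumption that the same method is extended (the actual patch may differ): treat 'k8s://' cluster-mode masters like YARN.
{code:scala}
import org.apache.spark.SparkConf

object DriverCoresProposed {
  // Hypothetical extension of the sketch above: k8s:// cluster-mode
  // masters also take their parallelism from spark.driver.cores, so
  // availableProcessors() is never consulted on the driver.
  def numDriverCores(master: String, conf: SparkConf): Int = master match {
    case m if (m == "yarn" || m.startsWith("k8s://")) && conf != null &&
        conf.get("spark.submit.deployMode", "client") == "cluster" =>
      conf.getInt("spark.driver.cores", 0)
    case _ => 0
  }
}
{code}
A user could then pass, for example, --conf spark.driver.cores=2 together with --conf spark.kubernetes.driver.request.cores=1 to run two threads' worth of parallelism in a container that only requests one CPU.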
> Driver cores in kubernetes are coupled with container resources, not
> spark.driver.cores
> ---------------------------------------------------------------------------------------
>
> Key: SPARK-30949
> URL: https://issues.apache.org/jira/browse/SPARK-30949
> Project: Spark
> Issue Type: Dependency upgrade
> Components: Kubernetes
> Affects Versions: 3.0.0
> Reporter: Onur Satici
> Priority: Major