[jira] [Commented] (SPARK-31173) Spark Kubernetes add tolerations and nodeName support

2020-03-23 Thread zhongwei liu (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-31173?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17064563#comment-17064563
 ] 

zhongwei liu commented on SPARK-31173:
--

[~seedjeffwan] The first one is the key point.

> Spark Kubernetes add tolerations and nodeName support
> -
>
>                Key: SPARK-31173
>                URL: https://issues.apache.org/jira/browse/SPARK-31173
>            Project: Spark
>         Issue Type: New Feature
>         Components: Kubernetes
>   Affects Versions: 3.1.0, 2.4.6
>        Environment: Alibaba Cloud ACK with spark operator(v1beta2-1.1.0-2.4.5) and spark(2.4.5)
>           Reporter: zhongwei liu
>           Priority: Trivial
>             Labels: features
>  Original Estimate: 72h
> Remaining Estimate: 72h
>
> When you run Spark on a serverless Kubernetes cluster (virtual-kubelet), you 
> need to specify nodeSelectors, tolerations, and even nodeName to get better 
> scheduling performance. Spark currently doesn't support tolerations. To use 
> this feature, you must decorate the pod with an admission controller webhook, 
> but the performance is extremely bad. Here is the benchmark:
> With webhook:
> Batch size: 500   Pod creation: about 7 pods/s   All pods running: 5 min
> Without webhook:
> Batch size: 500   Pod creation: more than 500 pods/s   All pods running: 45 s
> Adding tolerations and nodeName support to Spark would help a lot when 
> running large-scale jobs on a serverless Kubernetes cluster.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-31173) Spark Kubernetes add tolerations and nodeName support

2020-03-17 Thread zhongwei liu (Jira)
zhongwei liu created SPARK-31173:


         Summary: Spark Kubernetes add tolerations and nodeName support
             Key: SPARK-31173
             URL: https://issues.apache.org/jira/browse/SPARK-31173
         Project: Spark
      Issue Type: New Feature
      Components: Kubernetes
Affects Versions: 3.1.0, 2.4.6
     Environment: Alibaba Cloud ACK with spark operator(v1beta2-1.1.0-2.4.5) and spark(2.4.5)
        Reporter: zhongwei liu


When you run Spark on a serverless Kubernetes cluster (virtual-kubelet), you 
need to specify nodeSelectors, tolerations, and even nodeName to get better 
scheduling performance. Spark currently doesn't support tolerations. To use 
this feature, you must decorate the pod with an admission controller webhook, 
but the performance is extremely bad. Here is the benchmark:

With webhook:

Batch size: 500   Pod creation: about 7 pods/s   All pods running: 5 min

Without webhook:

Batch size: 500   Pod creation: more than 500 pods/s   All pods running: 45 s

Adding tolerations and nodeName support to Spark would help a lot when running 
large-scale jobs on a serverless Kubernetes cluster.
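
To make the request concrete, here is a rough sketch of the three pod spec fields involved (nodeSelector, tolerations, nodeName), written as a plain Python dict transformation. It is only an illustration of what a webhook has to patch into each pod today, not Spark code or the actual webhook. The helper name decorate_pod_spec, the image name, the label type=virtual-kubelet, the taint key virtual-kubelet.io/provider, and the node name virtual-kubelet are all assumptions for the example; the values in a real cluster may differ.

```python
import json


def decorate_pod_spec(pod_spec, node_selector=None, tolerations=None, node_name=None):
    """Return a copy of pod_spec with the scheduling fields filled in.

    Hypothetical helper: it mimics what an admission webhook would patch
    into the pod, using the standard Kubernetes PodSpec field names.
    """
    spec = dict(pod_spec)  # shallow copy so the input is left untouched
    if node_selector:
        spec["nodeSelector"] = dict(node_selector)
    if tolerations:
        spec["tolerations"] = [dict(t) for t in tolerations]
    if node_name:
        # Setting nodeName binds the pod to that node directly,
        # so the scheduler does not have to place it.
        spec["nodeName"] = node_name
    return spec


# Assumed example values; adjust to the labels and taints of your cluster.
base_spec = {"containers": [{"name": "spark-executor", "image": "spark:2.4.5"}]}
executor_spec = decorate_pod_spec(
    base_spec,
    node_selector={"type": "virtual-kubelet"},
    tolerations=[
        {"key": "virtual-kubelet.io/provider", "operator": "Exists", "effect": "NoSchedule"}
    ],
    node_name="virtual-kubelet",
)

print(json.dumps(executor_spec, indent=2))
```

If Spark could set these fields itself when it builds the driver and executor pods, the webhook round trip per pod would not be needed.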

 

 


