[ 
https://issues.apache.org/jira/browse/SPARK-55888?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated SPARK-55888:
-----------------------------------
    Labels: pull-request-available  (was: )

> Parallel pod creation for Kubernetes executor allocation
> --------------------------------------------------------
>
>                 Key: SPARK-55888
>                 URL: https://issues.apache.org/jira/browse/SPARK-55888
>             Project: Spark
>          Issue Type: Improvement
>          Components: Kubernetes
>    Affects Versions: 4.1.1
>            Reporter: Dejiu Lu
>            Priority: Major
>              Labels: pull-request-available
>
> Currently, `ExecutorPodsAllocator` creates pods one at a time: it sends a
> create request to the K8s API, waits for the response, and only then moves
> on to the next pod. This works fine for small jobs, but becomes a real
> bottleneck when a job needs thousands of executors.
> In our production environment, requesting 10,000 executors takes 3412s in
> Kubernetes mode (about 0.34s per pod) but only 34s in YARN mode, roughly a
> 100x difference. We achieved a significant speedup by changing the executor
> requests from serial to parallel.
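
To illustrate the serial-versus-parallel difference described in the issue, here is a minimal, self-contained sketch. It is not the code from the pull request and does not call the Kubernetes API: `create_pod`, `create_serial`, `create_parallel`, the `latency_s` value, and the `max_in_flight` limit are all hypothetical names and numbers chosen for illustration, and the API round trip is simulated with a sleep.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def create_pod(executor_id, latency_s=0.01):
    """Hypothetical stand-in for one blocking pod-create request."""
    time.sleep(latency_s)  # simulated API round trip, not a real K8s call
    return f"exec-{executor_id}"


def create_serial(ids, latency_s=0.01):
    """One request at a time: total time grows as len(ids) * latency_s."""
    return [create_pod(i, latency_s) for i in ids]


def create_parallel(ids, latency_s=0.01, max_in_flight=16):
    """Up to max_in_flight requests at once (an assumed, illustrative cap).

    pool.map returns results in input order, so callers see the same list
    as in the serial version.
    """
    with ThreadPoolExecutor(max_workers=max_in_flight) as pool:
        return list(pool.map(lambda i: create_pod(i, latency_s), ids))


if __name__ == "__main__":
    ids = list(range(64))
    for fn in (create_serial, create_parallel):
        start = time.perf_counter()
        fn(ids)
        print(f"{fn.__name__}: {time.perf_counter() - start:.2f}s")
```

In this sketch the serial loop pays one full round trip per pod, while the parallel version overlaps the waits. The cap on in-flight requests is there because an unbounded burst could overload the API server; what limit is appropriate in a real cluster is outside the scope of this example.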



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
