[
https://issues.apache.org/jira/browse/SPARK-34389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17282994#comment-17282994
]
Attila Zsolt Piros commented on SPARK-34389:
--------------------------------------------
[~ranju] as I see it, your question is more about the length of the pod
allocation timeout.
It was added by this PR: https://github.com/apache/spark/pull/30155
and the whole PR is about this timeout.
As you can see from the PR description, the default value was chosen based on
real-world clusters under load, and you can increase it if needed.
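As a rough illustration of increasing it: the sketch below assumes the timeout
in question is the spark.kubernetes.allocation.executor.timeout setting (please
verify the key against the documentation for your Spark version). The helper
name, the 900s value, and the placeholder master URL are hypothetical and only
show how the setting could be passed to spark-submit via --conf.

```python
# Assumption: this is the config key for the pod allocation timeout.
TIMEOUT_KEY = "spark.kubernetes.allocation.executor.timeout"


def with_allocation_timeout(submit_args, timeout="900s"):
    """Hypothetical helper: return a new spark-submit argument list
    with the allocation timeout appended as a --conf option."""
    return list(submit_args) + ["--conf", f"{TIMEOUT_KEY}={timeout}"]


# Build an example command line; the master URL is a placeholder.
args = with_allocation_timeout(["spark-submit", "--master", "k8s://<api-server>"])
print(" ".join(args))
```

The input list is copied rather than modified, so the same base arguments can
be reused with different timeout values.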
> Spark job on Kubernetes scheduled For Zero or less than minimum number of
> executors and Wait indefinitely under resource starvation
> -----------------------------------------------------------------------------------------------------------------------------------
>
> Key: SPARK-34389
> URL: https://issues.apache.org/jira/browse/SPARK-34389
> Project: Spark
> Issue Type: Bug
> Components: Kubernetes
> Affects Versions: 3.0.1
> Reporter: Ranju
> Priority: Major
> Attachments: DriverLogs_ExecutorLaunchedLessThanMinExecutor.txt,
> Steps to reproduce.docx
>
>
> If the cluster does not have sufficient resources (CPU/memory) for the minimum
> number of executors, the executors stay in the Pending state indefinitely
> until resources become free.
> Suppose, Cluster Configurations are:
> total Memory=204Gi
> used Memory=200Gi
> free memory= 4Gi
> spark.executor.memory=10G
> spark.dynamicAllocation.minExecutors=4
> spark.dynamicAllocation.maxExecutors=8
> Instead, the job should be cancelled if the requested minimum number of
> executors is not available at that point in time because of insufficient
> resources. Currently it does partial scheduling or no scheduling and waits
> indefinitely, so the job gets stuck.