[ 
https://issues.apache.org/jira/browse/SPARK-35460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17348543#comment-17348543
 ] 

Apache Spark commented on SPARK-35460:
--------------------------------------

User 'yaooqinn' has created a pull request for this issue:
https://github.com/apache/spark/pull/32610

>  invalid `spark.kubernetes.executor.podNamePrefix` causes app to hang
> ---------------------------------------------------------------------
>
>                 Key: SPARK-35460
>                 URL: https://issues.apache.org/jira/browse/SPARK-35460
>             Project: Spark
>          Issue Type: Improvement
>          Components: Kubernetes
>    Affects Versions: 3.1.1
>            Reporter: Kent Yao
>            Priority: Major
>
> {code:java}
> 21/05/20 21:41:21 WARN ExecutorPodsSnapshotsStoreImpl: Exception when 
> notifying snapshot subscriber.
> io.fabric8.kubernetes.client.KubernetesClientException: Failure executing: 
> POST at: 
> https://kubernetes.docker.internal:6443/api/v1/namespaces/default/pods. 
> Message: Pod "spark_exec-exec-688" is invalid: [metadata.name: Invalid value: 
> "spark_exec-exec-688": a DNS-1123 subdomain must consist of lower case 
> alphanumeric characters, '-' or '.', and must start and end with an 
> alphanumeric character (e.g. 'example.com', regex used for validation is 
> '[a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*'), 
> spec.hostname: Invalid value: "spark_exec-exec-688": a DNS-1123 label must 
> consist of lower case alphanumeric characters or '-', and must start and end 
> with an alphanumeric character (e.g. 'my-name',  or '123-abc', regex used for 
> validation is '[a-z0-9]([-a-z0-9]*[a-z0-9])?')]. Received status: 
> Status(apiVersion=v1, code=422, 
> details=StatusDetails(causes=[StatusCause(field=metadata.name, 
> message=Invalid value: "spark_exec-exec-688": a DNS-1123 subdomain must 
> consist of lower case alphanumeric characters, '-' or '.', and must start and 
> end with an alphanumeric character (e.g. 'example.com', regex used for 
> validation is 
> '[a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*'), 
> reason=FieldValueInvalid, additionalProperties={}), 
> StatusCause(field=spec.hostname, message=Invalid value: 
> "spark_exec-exec-688": a DNS-1123 label must consist of lower case 
> alphanumeric characters or '-', and must start and end with an alphanumeric 
> character (e.g. 'my-name',  or '123-abc', regex used for validation is 
> '[a-z0-9]([-a-z0-9]*[a-z0-9])?'), reason=FieldValueInvalid, 
> additionalProperties={})], group=null, kind=Pod, name=spark_exec-exec-688, 
> retryAfterSeconds=null, uid=null, additionalProperties={}), kind=Status, 
> message=Pod "spark_exec-exec-688" is invalid: [metadata.name: Invalid value: 
> "spark_exec-exec-688": a DNS-1123 subdomain must consist of lower case 
> alphanumeric characters, '-' or '.', and must start and end with an 
> alphanumeric character (e.g. 'example.com', regex used for validation is 
> '[a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*'), 
> spec.hostname: Invalid value: "spark_exec-exec-688": a DNS-1123 label must 
> consist of lower case alphanumeric characters or '-', and must start and end 
> with an alphanumeric character (e.g. 'my-name',  or '123-abc', regex used for 
> validation is '[a-z0-9]([-a-z0-9]*[a-z0-9])?')], 
> metadata=ListMeta(_continue=null, remainingItemCount=null, 
> resourceVersion=null, selfLink=null, additionalProperties={}), 
> reason=Invalid, status=Failure, additionalProperties={}).
>       at 
> io.fabric8.kubernetes.client.dsl.base.OperationSupport.requestFailure(OperationSupport.java:583)
>       at 
> io.fabric8.kubernetes.client.dsl.base.OperationSupport.assertResponseCode(OperationSupport.java:522)
>       at 
> io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:487)
>       at 
> io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:448)
>       at 
> io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleCreate(OperationSupport.java:263)
>       at 
> io.fabric8.kubernetes.client.dsl.base.BaseOperation.handleCreate(BaseOperation.java:870)
>       at 
> io.fabric8.kubernetes.client.dsl.base.BaseOperation.create(BaseOperation.java:365)
>       at 
> io.fabric8.kubernetes.client.dsl.base.BaseOperation.create(BaseOperation.java:86)
> {code}
> When `spark.kubernetes.executor.podNamePrefix` contains characters that are 
> not valid in a Kubernetes pod name, every executor pod request the driver 
> sends to the Kubernetes API server is rejected, so the app hangs and keeps 
> logging the above message.
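
The failure above could be caught before any request reaches the API server by checking the prefix up front. The following is only a minimal sketch of such a check, not the change made in the linked pull request. The regex is the DNS-1123 label pattern quoted in the error message, and 63 is the DNS-1123 label length limit. The `<prefix>-exec-<id>` naming pattern is an assumption inferred from the rejected name `spark_exec-exec-688`, and the function names are hypothetical.

```python
import re

# DNS-1123 label pattern, copied from the API server's error message above.
DNS1123_LABEL = re.compile(r"[a-z0-9]([-a-z0-9]*[a-z0-9])?")
MAX_LABEL_LENGTH = 63  # length limit for a DNS-1123 label


def executor_pod_name(prefix: str, executor_id: int) -> str:
    """Build an executor pod name.

    Assumption: the "<prefix>-exec-<id>" pattern is inferred from the
    rejected name "spark_exec-exec-688" in the log, not taken from Spark.
    """
    return f"{prefix}-exec-{executor_id}"


def is_valid_pod_name_prefix(prefix: str, sample_executor_id: int = 1) -> bool:
    """Return True if a pod name built from the prefix is a valid DNS-1123 label."""
    name = executor_pod_name(prefix, sample_executor_id)
    if len(name) > MAX_LABEL_LENGTH:
        return False
    # fullmatch ensures the whole name matches, not just a leading part of it.
    return DNS1123_LABEL.fullmatch(name) is not None


if __name__ == "__main__":
    for candidate in ("spark_exec", "spark-exec"):
        status = "valid" if is_valid_pod_name_prefix(candidate) else "invalid"
        print(f"{candidate}: {status}")
```

Running a check like this when the configuration is read would let the application fail fast with a clear error instead of retrying a request that can never succeed.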




