[ https://issues.apache.org/jira/browse/SPARK-35460?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Dongjoon Hyun reassigned SPARK-35460:
-------------------------------------
Assignee: Kent Yao
> invalid `spark.kubernetes.executor.podNamePrefix` causes app to hang
> ---------------------------------------------------------------------
>
> Key: SPARK-35460
> URL: https://issues.apache.org/jira/browse/SPARK-35460
> Project: Spark
> Issue Type: Improvement
> Components: Kubernetes
> Affects Versions: 3.2.0
> Reporter: Kent Yao
> Assignee: Kent Yao
> Priority: Major
>
> {code:java}
> 21/05/20 21:41:21 WARN ExecutorPodsSnapshotsStoreImpl: Exception when
> notifying snapshot subscriber.
> io.fabric8.kubernetes.client.KubernetesClientException: Failure executing:
> POST at:
> https://kubernetes.docker.internal:6443/api/v1/namespaces/default/pods.
> Message: Pod "spark_exec-exec-688" is invalid: [metadata.name: Invalid value:
> "spark_exec-exec-688": a DNS-1123 subdomain must consist of lower case
> alphanumeric characters, '-' or '.', and must start and end with an
> alphanumeric character (e.g. 'example.com', regex used for validation is
> '[a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*'),
> spec.hostname: Invalid value: "spark_exec-exec-688": a DNS-1123 label must
> consist of lower case alphanumeric characters or '-', and must start and end
> with an alphanumeric character (e.g. 'my-name', or '123-abc', regex used for
> validation is '[a-z0-9]([-a-z0-9]*[a-z0-9])?')]. Received status:
> Status(apiVersion=v1, code=422,
> details=StatusDetails(causes=[StatusCause(field=metadata.name,
> message=Invalid value: "spark_exec-exec-688": a DNS-1123 subdomain must
> consist of lower case alphanumeric characters, '-' or '.', and must start and
> end with an alphanumeric character (e.g. 'example.com', regex used for
> validation is
> '[a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*'),
> reason=FieldValueInvalid, additionalProperties={}),
> StatusCause(field=spec.hostname, message=Invalid value:
> "spark_exec-exec-688": a DNS-1123 label must consist of lower case
> alphanumeric characters or '-', and must start and end with an alphanumeric
> character (e.g. 'my-name', or '123-abc', regex used for validation is
> '[a-z0-9]([-a-z0-9]*[a-z0-9])?'), reason=FieldValueInvalid,
> additionalProperties={})], group=null, kind=Pod, name=spark_exec-exec-688,
> retryAfterSeconds=null, uid=null, additionalProperties={}), kind=Status,
> message=Pod "spark_exec-exec-688" is invalid: [metadata.name: Invalid value:
> "spark_exec-exec-688": a DNS-1123 subdomain must consist of lower case
> alphanumeric characters, '-' or '.', and must start and end with an
> alphanumeric character (e.g. 'example.com', regex used for validation is
> '[a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*'),
> spec.hostname: Invalid value: "spark_exec-exec-688": a DNS-1123 label must
> consist of lower case alphanumeric characters or '-', and must start and end
> with an alphanumeric character (e.g. 'my-name', or '123-abc', regex used for
> validation is '[a-z0-9]([-a-z0-9]*[a-z0-9])?')],
> metadata=ListMeta(_continue=null, remainingItemCount=null,
> resourceVersion=null, selfLink=null, additionalProperties={}),
> reason=Invalid, status=Failure, additionalProperties={}).
> at io.fabric8.kubernetes.client.dsl.base.OperationSupport.requestFailure(OperationSupport.java:583)
> at io.fabric8.kubernetes.client.dsl.base.OperationSupport.assertResponseCode(OperationSupport.java:522)
> at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:487)
> at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:448)
> at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleCreate(OperationSupport.java:263)
> at io.fabric8.kubernetes.client.dsl.base.BaseOperation.handleCreate(BaseOperation.java:870)
> at io.fabric8.kubernetes.client.dsl.base.BaseOperation.create(BaseOperation.java:365)
> at io.fabric8.kubernetes.client.dsl.base.BaseOperation.create(BaseOperation.java:86)
> {code}
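> When `spark.kubernetes.executor.podNamePrefix` contains characters that are
> not valid in a Kubernetes pod name (for example, the underscore in
> `spark_exec`), every executor pod request from the driver is rejected by the
> Kubernetes API server. The driver keeps failing to request executors, so the
> app hangs while repeatedly logging the message above.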
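
One way to avoid the hang would be to reject a bad prefix before any pod is requested. The following is only a minimal sketch of such a check, not the fix adopted for this issue: the class `PodNamePrefixCheck` and its methods are hypothetical, the regex is copied from the error message above, the `<prefix>-exec-<id>` naming is inferred from the rejected name `spark_exec-exec-688`, and the 63-character cap is the usual DNS-1123 label limit.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of Spark: checks whether an executor pod
// name built from a given prefix would pass DNS-1123 label validation.
final class PodNamePrefixCheck {
    // DNS-1123 label pattern, copied from the Kubernetes error message.
    private static final Pattern DNS_1123_LABEL =
        Pattern.compile("[a-z0-9]([-a-z0-9]*[a-z0-9])?");

    // DNS-1123 labels are limited to 63 characters.
    private static final int MAX_LABEL_LENGTH = 63;

    // Assumed naming scheme, inferred from "spark_exec-exec-688" in the log.
    static String executorPodName(String prefix, long executorId) {
        return prefix + "-exec-" + executorId;
    }

    // True if the resulting pod name is a valid DNS-1123 label.
    static boolean isValidPrefix(String prefix, long executorId) {
        String name = executorPodName(prefix, executorId);
        return name.length() <= MAX_LABEL_LENGTH
            && DNS_1123_LABEL.matcher(name).matches();
    }

    public static void main(String[] args) {
        // The prefix from the report contains '_' and is rejected.
        System.out.println(isValidPrefix("spark_exec", 688));
        // Replacing '_' with '-' yields an acceptable name.
        System.out.println(isValidPrefix("spark-exec", 688));
    }
}
```

Because the executor id is appended to the prefix, the check runs on the full name rather than on the prefix alone. A prefix that looks fine by itself can still push the final name past the length limit.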