[ https://issues.apache.org/jira/browse/SPARK-35284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17374752#comment-17374752 ]
Leona Yoda commented on SPARK-35284:
------------------------------------
I once ran into a similar error (with a Kinesis DStream, so not exactly the
same case), and at that time the kubernetes-client version was the cause.
[https://github.com/fabric8io/kubernetes-client#kubernetes-compatibility-matrix]
Spark 3.0.3 uses kubernetes-client v4.9.2, which does not support K8s 1.19
according to that compatibility matrix.
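
To confirm which kubernetes-client version is actually on the driver's
classpath before comparing it with the matrix, something like the sketch below
may help. It is only an illustration: the class name ClientVersionCheck and the
method implementationVersion are made up for this example, and it assumes the
fabric8 jar declares Implementation-Version in its manifest, which I have not
verified. If it prints no version, the kubernetes-client jar file name in
Spark's jars directory is another place to look.

```java
import java.util.Optional;

public class ClientVersionCheck {

    /**
     * Looks up a class by name without initializing it and returns the
     * Implementation-Version recorded in its jar manifest. The result is
     * empty if the class is not on the classpath or no version is declared.
     */
    public static Optional<String> implementationVersion(String className) {
        try {
            Class<?> cls = Class.forName(
                    className, false, ClientVersionCheck.class.getClassLoader());
            // Optional.map turns a null version string into an empty Optional.
            return Optional.ofNullable(cls.getPackage())
                    .map(Package::getImplementationVersion);
        } catch (ClassNotFoundException | LinkageError e) {
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        // Main client interface of the fabric8 kubernetes-client library.
        String cls = "io.fabric8.kubernetes.client.KubernetesClient";
        System.out.println(cls + " -> "
                + implementationVersion(cls)
                        .orElse("not on classpath, or no version in manifest"));
    }
}
```

Run with the same classpath as the Spark driver, this prints the client version
if it can be determined, which can then be checked against the Kubernetes
version of the cluster (v1.19.7 in this report).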
> Kubernetes Fabric exception with Scala programs in Spark 3.x
> ------------------------------------------------------------
>
> Key: SPARK-35284
> URL: https://issues.apache.org/jira/browse/SPARK-35284
> Project: Spark
> Issue Type: Bug
> Components: Kubernetes
> Affects Versions: 3.0.0
> Environment: Docker Desktop v 3.2 on Windows 10. Kubernetes v1.19.7.
> Apps are launched with the latest Spark Operator. Kafka with Confluent
> Platform 6.0
>
> Reporter: Ketan Doshi
> Priority: Major
>
> An exception occurs when running a small Scala app on Spark 3.x on Kubernetes.
> Python programs work fine. The applications are launched using Spark Operator.
> The app uses Spark Structured Streaming and reads and writes JSON data to a
> Kafka topic. This happens during development, so only 5-10 small records are
> being written, and the app doesn't run for more than 3-4 minutes.
> The error is intermittent and shows up as different failure scenarios, which
> makes Scala apps very unstable:
> e.g. Kafka read succeeds but Kafka write fails
> e.g. writes to Console or Memory don't work at all; no output is produced
> e.g. read from file stream and write to Kafka usually works
>
> 21/04/30 10:24:13 INFO TaskSetManager: Starting task 0.0 in stage 0.0 (TID 0, 10.1.14.118, executor 1, partition 0, PROCESS_LOCAL, 8414 bytes)
> 21/04/30 10:24:19 INFO BlockManagerInfo: Added broadcast_1_piece0 in memory on 10.1.14.118:43469 (size: 9.5 KiB, free: 117.0 MiB)
> 21/04/30 10:24:53 ERROR Utils: Uncaught exception in thread kubernetes-executor-pod-polling-sync
> io.fabric8.kubernetes.client.KubernetesClientException: Operation: [list] for kind: [Pod] with name: [null] in namespace: [spark-app] failed.
> at io.fabric8.kubernetes.client.KubernetesClientException.launderThrowable(KubernetesClientException.java:64)
> at io.fabric8.kubernetes.client.KubernetesClientException.launderThrowable(KubernetesClientException.java:72)
> at io.fabric8.kubernetes.client.dsl.base.BaseOperation.listRequestHelper(BaseOperation.java:155)
> at io.fabric8.kubernetes.client.dsl.base.BaseOperation.list(BaseOperation.java:621)
> at io.fabric8.kubernetes.client.dsl.base.BaseOperation.list(BaseOperation.java:70)
> at org.apache.spark.scheduler.cluster.k8s.ExecutorPodsPollingSnapshotSource$PollRunnable.$anonfun$run$1(ExecutorPodsPollingSnapshotSource.scala:61)
> at org.apache.spark.util.Utils$.tryLogNonFatalError(Utils.scala:1357)
> at org.apache.spark.scheduler.cluster.k8s.ExecutorPodsPollingSnapshotSource$PollRunnable.run(ExecutorPodsPollingSnapshotSource.scala:56)
> at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
> at java.base/java.util.concurrent.FutureTask.runAndReset(Unknown Source)
> at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(Unknown Source)
> at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
> at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
> at java.base/java.lang.Thread.run(Unknown Source)
> Caused by: java.net.SocketTimeoutException: timeout
> at okio.Okio$4.newTimeoutException(Okio.java:232)
> at okio.AsyncTimeout.exit(AsyncTimeout.java:285)
> at okio.AsyncTimeout$2.read(AsyncTimeout.java:241)
> at okio.RealBufferedSource.indexOf(RealBufferedSource.java:354)
> at okio.RealBufferedSource.readUtf8LineStrict(RealBufferedSource.java:226)
> at okhttp3.internal.http1.Http1Codec.readHeaderLine(Http1Codec.java:215)
> at okhttp3.internal.http1.Http1Codec.readResponseHeaders(Http1Codec.java:189)
> at okhttp3.internal.http.CallServerInterceptor.intercept(CallServerInterceptor.java:88)
> at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147)
> at okhttp3.internal.connection.ConnectInterceptor.intercept(ConnectInterceptor.java:45)
> at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147)
> at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:121)
> at okhttp3.internal.cache.CacheInterceptor.intercept(CacheInterceptor.java:93)
> at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147)
> at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:121)
> at okhttp3.internal.http.BridgeInterceptor.intercept(BridgeInterceptor.java:93)
> at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147)
> at okhttp3.internal.http.RetryAndFollowUpInterceptor.intercept(RetryAndFollowUpInterceptor.java:127)
> at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147)
> at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:121)
> at io.fabric8.kubernetes.client.utils.BackwardsCompatibilityInterceptor.intercept(BackwardsCompatibilityInterceptor.java:134)
> at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147)
> at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:121)
> at io.fabric8.kubernetes.client.utils.ImpersonatorInterceptor.intercept(ImpersonatorInterceptor.java:68)
> at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147)
> at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:121)
> at io.fabric8.kubernetes.client.utils.HttpClientUtils.lambda$createHttpClient$3(HttpClientUtils.java:109)
> at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147)
> at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:121)
> at okhttp3.RealCall.getResponseWithInterceptorChain(RealCall.java:257)
> at okhttp3.RealCall.execute(RealCall.java:93)
> at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:469)
> at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:430)
> at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:412)
> at io.fabric8.kubernetes.client.dsl.base.BaseOperation.listRequestHelper(BaseOperation.java:151)
> ... 11 more
> Caused by: java.net.SocketTimeoutException: Read timed out
> at java.base/java.net.SocketInputStream.socketRead0(Native Method)
> at java.base/java.net.SocketInputStream.socketRead(Unknown Source)
> at java.base/java.net.SocketInputStream.read(Unknown Source)
> at java.base/java.net.SocketInputStream.read(Unknown Source)
> at java.base/sun.security.ssl.SSLSocketInputRecord.read(Unknown Source)
> at java.base/sun.security.ssl.SSLSocketInputRecord.readHeader(Unknown Source)
> at java.base/sun.security.ssl.SSLSocketInputRecord.bytesInCompletePacket(Unknown Source)
> at java.base/sun.security.ssl.SSLSocketImpl.readApplicationRecord(Unknown Source)
> at java.base/sun.security.ssl.SSLSocketImpl$AppInputStream.read(Unknown Source)
> at okio.Okio$2.read(Okio.java:140)
> at okio.AsyncTimeout$2.read(AsyncTimeout.java:237)
> ... 43 more