[ https://issues.apache.org/jira/browse/SPARK-46128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
prakash gurung updated SPARK-46128:
-----------------------------------
Description:
The spark-submit driver fails to resolve "kubernetes.default.svc" when it tries to create executors on newly added worker nodes; there is no issue on the existing worker nodes.

Spark versions tried:
* 3.5.0
* 3.1.2

On-premises Kubernetes cluster set up with kubeadm:
* Kubernetes version: v1.28.2
* OS: Ubuntu 22.04.1 (Jammy)
* Container Runtime: 1.6.24

Complete error:
{noformat}
+ CMD=("$SPARK_HOME/bin/spark-submit" --conf "spark.driver.bindAddress=$SPARK_DRIVER_BIND_ADDRESS" --deploy-mode client "$@")
+ exec /usr/bin/tini -s -- /opt/spark/bin/spark-submit --conf spark.driver.bindAddress=192.168.95.23 --deploy-mode client --properties-file /opt/spark/conf/spark.properties --class org.apache.spark.examples.SparkPi local:///opt/spark/examples/jars/spark-examples_2.12-3.1.2.jar
WARNING: An illegal reflective access operation has occurred
WARNING: Illegal reflective access by org.apache.spark.unsafe.Platform (file:/opt/spark/jars/spark-unsafe_2.12-3.1.2.jar) to constructor java.nio.DirectByteBuffer(long,int)
WARNING: Please consider reporting this to the maintainers of org.apache.spark.unsafe.Platform
WARNING: Use --illegal-access=warn to enable warnings of further illegal reflective access operations
WARNING: All illegal access operations will be denied in a future release
23/11/28 01:20:03 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
23/11/28 01:20:03 INFO SparkContext: Running Spark version 3.1.2
23/11/28 01:20:03 INFO ResourceUtils: ==============================================================
23/11/28 01:20:03 INFO ResourceUtils: No custom resources configured for spark.driver.
23/11/28 01:20:03 INFO ResourceUtils: ==============================================================
23/11/28 01:20:03 INFO SparkContext: Submitted application: Spark Pi
23/11/28 01:20:03 INFO ResourceProfile: Default ResourceProfile created, executor resources: Map(cores -> name: cores, amount: 1, script: , vendor: , memory -> name: memory, amount: 1024, script: , vendor: , offHeap -> name: offHeap, amount: 0, script: , vendor: ), task resources: Map(cpus -> name: cpus, amount: 1.0)
23/11/28 01:20:03 INFO ResourceProfile: Limiting resource is cpus at 1 tasks per executor
23/11/28 01:20:03 INFO ResourceProfileManager: Added ResourceProfile id: 0
23/11/28 01:20:03 INFO SecurityManager: Changing view acls to: 185,root
23/11/28 01:20:03 INFO SecurityManager: Changing modify acls to: 185,root
23/11/28 01:20:03 INFO SecurityManager: Changing view acls groups to:
23/11/28 01:20:03 INFO SecurityManager: Changing modify acls groups to:
23/11/28 01:20:03 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(185, root); groups with view permissions: Set(); users with modify permissions: Set(185, root); groups with modify permissions: Set()
23/11/28 01:20:04 INFO Utils: Successfully started service 'sparkDriver' on port 7078.
23/11/28 01:20:04 INFO SparkEnv: Registering MapOutputTracker
23/11/28 01:20:04 INFO SparkEnv: Registering BlockManagerMaster
23/11/28 01:20:04 INFO BlockManagerMasterEndpoint: Using org.apache.spark.storage.DefaultTopologyMapper for getting topology information
23/11/28 01:20:04 INFO BlockManagerMasterEndpoint: BlockManagerMasterEndpoint up
23/11/28 01:20:04 INFO SparkEnv: Registering BlockManagerMasterHeartbeat
23/11/28 01:20:04 INFO DiskBlockManager: Created local directory at /var/data/spark-f0634fda-1366-4da1-8ac2-262e4bf9952b/blockmgr-7ac2193b-f7ad-4bc2-bdfa-386d2d3f4bf6
23/11/28 01:20:04 INFO MemoryStore: MemoryStore started with capacity 413.9 MiB
23/11/28 01:20:04 INFO SparkEnv: Registering OutputCommitCoordinator
23/11/28 01:20:04 INFO Utils: Successfully started service 'SparkUI' on port 4040.
23/11/28 01:20:04 INFO SparkUI: Bound SparkUI to 0.0.0.0, and started at http://spark-pi-96895e8c1382ff30-driver-svc.default.svc:4040
23/11/28 01:20:04 INFO SparkContext: Added JAR local:///opt/spark/examples/jars/spark-examples_2.12-3.1.2.jar at file:/opt/spark/examples/jars/spark-examples_2.12-3.1.2.jar with timestamp 1701134403914
23/11/28 01:20:04 WARN SparkContext: The jar local:///opt/spark/examples/jars/spark-examples_2.12-3.1.2.jar has been added already. Overwriting of added jars is not supported in the current version.
23/11/28 01:20:04 INFO SparkKubernetesClientFactory: Auto-configuring K8S client using current context from users K8S config file
23/11/28 01:20:24 ERROR SparkContext: Error initializing SparkContext.
org.apache.spark.SparkException: External scheduler cannot be instantiated
	at org.apache.spark.SparkContext$.org$apache$spark$SparkContext$$createTaskScheduler(SparkContext.scala:2961)
	at org.apache.spark.SparkContext.<init>(SparkContext.scala:557)
	at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2672)
	at org.apache.spark.sql.SparkSession$Builder.$anonfun$getOrCreate$2(SparkSession.scala:945)
	at scala.Option.getOrElse(Option.scala:189)
	at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:939)
	at org.apache.spark.examples.SparkPi$.main(SparkPi.scala:30)
	at org.apache.spark.examples.SparkPi.main(SparkPi.scala)
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
	at java.base/java.lang.reflect.Method.invoke(Unknown Source)
	at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
	at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:951)
	at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:180)
	at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203)
	at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90)
	at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1039)
	at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1048)
	at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: io.fabric8.kubernetes.client.KubernetesClientException: Operation: [get] for kind: [Pod] with name: [spark-pi-96895e8c1382ff30-driver] in namespace: [default] failed.
	at io.fabric8.kubernetes.client.KubernetesClientException.launderThrowable(KubernetesClientException.java:64)
	at io.fabric8.kubernetes.client.KubernetesClientException.launderThrowable(KubernetesClientException.java:72)
	at io.fabric8.kubernetes.client.dsl.base.BaseOperation.getMandatory(BaseOperation.java:225)
	at io.fabric8.kubernetes.client.dsl.base.BaseOperation.get(BaseOperation.java:186)
	at io.fabric8.kubernetes.client.dsl.base.BaseOperation.get(BaseOperation.java:84)
	at org.apache.spark.scheduler.cluster.k8s.ExecutorPodsAllocator.$anonfun$driverPod$1(ExecutorPodsAllocator.scala:75)
	at scala.Option.map(Option.scala:230)
	at org.apache.spark.scheduler.cluster.k8s.ExecutorPodsAllocator.<init>(ExecutorPodsAllocator.scala:74)
	at org.apache.spark.scheduler.cluster.k8s.KubernetesClusterManager.createSchedulerBackend(KubernetesClusterManager.scala:123)
	at org.apache.spark.SparkContext$.org$apache$spark$SparkContext$$createTaskScheduler(SparkContext.scala:2955)
	... 19 more
Caused by: java.net.UnknownHostException: kubernetes.default.svc: Temporary failure in name resolution
	at java.base/java.net.Inet6AddressImpl.lookupAllHostAddr(Native Method)
	at java.base/java.net.InetAddress$PlatformNameService.lookupAllHostAddr(Unknown Source)
	at java.base/java.net.InetAddress.getAddressesFromNameService(Unknown Source)
	at java.base/java.net.InetAddress$NameServiceAddresses.get(Unknown Source)
	at java.base/java.net.InetAddress.getAllByName0(Unknown Source)
	at java.base/java.net.InetAddress.getAllByName(Unknown Source)
	at java.base/java.net.InetAddress.getAllByName(Unknown Source)
	at okhttp3.Dns$1.lookup(Dns.java:40)
	at okhttp3.internal.connection.RouteSelector.resetNextInetSocketAddress(RouteSelector.java:185)
	at okhttp3.internal.connection.RouteSelector.nextProxy(RouteSelector.java:149)
	at okhttp3.internal.connection.RouteSelector.next(RouteSelector.java:84)
	at okhttp3.internal.connection.StreamAllocation.findConnection(StreamAllocation.java:215)
	at okhttp3.internal.connection.StreamAllocation.findHealthyConnection(StreamAllocation.java:135)
	at okhttp3.internal.connection.StreamAllocation.newStream(StreamAllocation.java:114)
	at okhttp3.internal.connection.ConnectInterceptor.intercept(ConnectInterceptor.java:42)
	at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147)
	at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:121)
	at okhttp3.internal.cache.CacheInterceptor.intercept(CacheInterceptor.java:93)
	at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147)
	at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:121)
	at okhttp3.internal.http.BridgeInterceptor.intercept(BridgeInterceptor.java:93)
	at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147)
	at okhttp3.internal.http.RetryAndFollowUpInterceptor.intercept(RetryAndFollowUpInterceptor.java:127)
	at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147)
	at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:121)
	at io.fabric8.kubernetes.client.utils.BackwardsCompatibilityInterceptor.intercept(BackwardsCompatibilityInterceptor.java:135)
	at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147)
	at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:121)
	at io.fabric8.kubernetes.client.utils.OIDCTokenRefreshInterceptor.intercept(OIDCTokenRefreshInterceptor.java:41)
	at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147)
	at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:121)
	at io.fabric8.kubernetes.client.utils.ImpersonatorInterceptor.intercept(ImpersonatorInterceptor.java:68)
	at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147)
	at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:121)
	at io.fabric8.kubernetes.client.utils.HttpClientUtils.lambda$createHttpClient$3(HttpClientUtils.java:151)
	at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147)
	at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:121)
	at okhttp3.RealCall.getResponseWithInterceptorChain(RealCall.java:257)
	at okhttp3.RealCall.execute(RealCall.java:93)
	at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:490)
	at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:451)
	at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleGet(OperationSupport.java:416)
	at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleGet(OperationSupport.java:397)
	at io.fabric8.kubernetes.client.dsl.base.BaseOperation.handleGet(BaseOperation.java:933)
	at io.fabric8.kubernetes.client.dsl.base.BaseOperation.getMandatory(BaseOperation.java:220)
	... 26 more
23/11/28 01:20:24 INFO SparkUI: Stopped Spark web UI at http://spark-pi-96895e8c1382ff30-driver-svc.default.svc:4040
23/11/28 01:20:24 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
23/11/28 01:20:24 INFO MemoryStore: MemoryStore cleared
23/11/28 01:20:24 INFO BlockManager: BlockManager stopped
23/11/28 01:20:24 INFO BlockManagerMaster: BlockManagerMaster stopped
23/11/28 01:20:24 WARN MetricsSystem: Stopping a MetricsSystem that is not running
23/11/28 01:20:24 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
23/11/28 01:20:24 INFO SparkContext: Successfully stopped SparkContext
Exception in thread "main" org.apache.spark.SparkException: External scheduler cannot be instantiated
	at org.apache.spark.SparkContext$.org$apache$spark$SparkContext$$createTaskScheduler(SparkContext.scala:2961)
	at org.apache.spark.SparkContext.<init>(SparkContext.scala:557)
	at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2672)
	at org.apache.spark.sql.SparkSession$Builder.$anonfun$getOrCreate$2(SparkSession.scala:945)
	at scala.Option.getOrElse(Option.scala:189)
	at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:939)
	at org.apache.spark.examples.SparkPi$.main(SparkPi.scala:30)
	at org.apache.spark.examples.SparkPi.main(SparkPi.scala)
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
	at java.base/java.lang.reflect.Method.invoke(Unknown Source)
	at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
	at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:951)
	at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:180)
	at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203)
	at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90)
	at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1039)
	at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1048)
	at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: io.fabric8.kubernetes.client.KubernetesClientException: Operation: [get] for kind: [Pod] with name: [spark-pi-96895e8c1382ff30-driver] in namespace: [default] failed.
	at io.fabric8.kubernetes.client.KubernetesClientException.launderThrowable(KubernetesClientException.java:64)
	at io.fabric8.kubernetes.client.KubernetesClientException.launderThrowable(KubernetesClientException.java:72)
	at io.fabric8.kubernetes.client.dsl.base.BaseOperation.getMandatory(BaseOperation.java:225)
	at io.fabric8.kubernetes.client.dsl.base.BaseOperation.get(BaseOperation.java:186)
	at io.fabric8.kubernetes.client.dsl.base.BaseOperation.get(BaseOperation.java:84)
	at org.apache.spark.scheduler.cluster.k8s.ExecutorPodsAllocator.$anonfun$driverPod$1(ExecutorPodsAllocator.scala:75)
	at scala.Option.map(Option.scala:230)
	at org.apache.spark.scheduler.cluster.k8s.ExecutorPodsAllocator.<init>(ExecutorPodsAllocator.scala:74)
	at org.apache.spark.scheduler.cluster.k8s.KubernetesClusterManager.createSchedulerBackend(KubernetesClusterManager.scala:123)
	at org.apache.spark.SparkContext$.org$apache$spark$SparkContext$$createTaskScheduler(SparkContext.scala:2955)
	... 19 more
Caused by: java.net.UnknownHostException: kubernetes.default.svc: Temporary failure in name resolution
	at java.base/java.net.Inet6AddressImpl.lookupAllHostAddr(Native Method)
	at java.base/java.net.InetAddress$PlatformNameService.lookupAllHostAddr(Unknown Source)
	at java.base/java.net.InetAddress.getAddressesFromNameService(Unknown Source)
	at java.base/java.net.InetAddress$NameServiceAddresses.get(Unknown Source)
	at java.base/java.net.InetAddress.getAllByName0(Unknown Source)
	at java.base/java.net.InetAddress.getAllByName(Unknown Source)
	at java.base/java.net.InetAddress.getAllByName(Unknown Source)
	at okhttp3.Dns$1.lookup(Dns.java:40)
	at okhttp3.internal.connection.RouteSelector.resetNextInetSocketAddress(RouteSelector.java:185)
	at okhttp3.internal.connection.RouteSelector.nextProxy(RouteSelector.java:149)
	at okhttp3.internal.connection.RouteSelector.next(RouteSelector.java:84)
	at okhttp3.internal.connection.StreamAllocation.findConnection(StreamAllocation.java:215)
	at okhttp3.internal.connection.StreamAllocation.findHealthyConnection(StreamAllocation.java:135)
	at okhttp3.internal.connection.StreamAllocation.newStream(StreamAllocation.java:114)
	at okhttp3.internal.connection.ConnectInterceptor.intercept(ConnectInterceptor.java:42)
	at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147)
	at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:121)
	at okhttp3.internal.cache.CacheInterceptor.intercept(CacheInterceptor.java:93)
	at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147)
	at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:121)
	at okhttp3.internal.http.BridgeInterceptor.intercept(BridgeInterceptor.java:93)
	at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147)
	at okhttp3.internal.http.RetryAndFollowUpInterceptor.intercept(RetryAndFollowUpInterceptor.java:127)
	at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147)
	at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:121)
	at io.fabric8.kubernetes.client.utils.BackwardsCompatibilityInterceptor.intercept(BackwardsCompatibilityInterceptor.java:135)
	at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147)
	at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:121)
	at io.fabric8.kubernetes.client.utils.OIDCTokenRefreshInterceptor.intercept(OIDCTokenRefreshInterceptor.java:41)
	at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147)
	at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:121)
	at io.fabric8.kubernetes.client.utils.ImpersonatorInterceptor.intercept(ImpersonatorInterceptor.java:68)
	at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147)
	at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:121)
	at io.fabric8.kubernetes.client.utils.HttpClientUtils.lambda$createHttpClient$3(HttpClientUtils.java:151)
	at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147)
	at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:121)
	at okhttp3.RealCall.getResponseWithInterceptorChain(RealCall.java:257)
	at okhttp3.RealCall.execute(RealCall.java:93)
	at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:490)
	at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:451)
	at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleGet(OperationSupport.java:416)
	at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleGet(OperationSupport.java:397)
	at io.fabric8.kubernetes.client.dsl.base.BaseOperation.handleGet(BaseOperation.java:933)
	at io.fabric8.kubernetes.client.dsl.base.BaseOperation.getMandatory(BaseOperation.java:220)
	... 26 more
23/11/28 01:20:24 INFO ShutdownHookManager: Shutdown hook called
23/11/28 01:20:24 INFO ShutdownHookManager: Deleting directory /var/data/spark-f0634fda-1366-4da1-8ac2-262e4bf9952b/spark-0190347a-61ed-45b3-bddc-d0a92db7bcc8
23/11/28 01:20:24 INFO ShutdownHookManager: Deleting directory /tmp/spark-7ca3d253-94b6-442c-b557-f4270c3d12ce
{noformat}

Similar issue: https://issues.apache.org/jira/browse/SPARK-29640

was: Spark submit driver fails to resolve "kubernetes.default.svc" when trying to create executors.
okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147) at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:121) at io.fabric8.kubernetes.client.utils.BackwardsCompatibilityInterceptor.intercept(BackwardsCompatibilityInterceptor.java:135) at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147) at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:121) at io.fabric8.kubernetes.client.utils.OIDCTokenRefreshInterceptor.intercept(OIDCTokenRefreshInterceptor.java:41) at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147) at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:121) at io.fabric8.kubernetes.client.utils.ImpersonatorInterceptor.intercept(ImpersonatorInterceptor.java:68) at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147) at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:121) at io.fabric8.kubernetes.client.utils.HttpClientUtils.lambda$createHttpClient$3(HttpClientUtils.java:151) at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147) at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:121) at okhttp3.RealCall.getResponseWithInterceptorChain(RealCall.java:257) at okhttp3.RealCall.execute(RealCall.java:93) at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:490) at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:451) at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleGet(OperationSupport.java:416) at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleGet(OperationSupport.java:397) at io.fabric8.kubernetes.client.dsl.base.BaseOperation.handleGet(BaseOperation.java:933) at io.fabric8.kubernetes.client.dsl.base.BaseOperation.getMandatory(BaseOperation.java:220) ... 
26 more
23/11/22 03:27:41 INFO ShutdownHookManager: Shutdown hook called
23/11/22 03:27:41 INFO ShutdownHookManager: Deleting directory /tmp/spark-2f306210-bd49-47ad-a12b-db283e4ca6fd
23/11/22 03:27:41 INFO ShutdownHookManager: Deleting directory /var/data/spark-9239c605-130e-4feb-b050-a33546d330bb/spark-8840557b-371c-413e-a29c-a1e8f2ec748a
{noformat}

Similar issue: https://issues.apache.org/jira/browse/SPARK-29640

> External scheduler cannot be instantiated
> -----------------------------------------
>
>                 Key: SPARK-46128
>                 URL: https://issues.apache.org/jira/browse/SPARK-46128
>             Project: Spark
>          Issue Type: Bug
>          Components: Kubernetes, Spark Core, Spark Submit
>    Affects Versions: 3.1.2, 3.5.0
>            Reporter: prakash gurung
>            Priority: Major
>

--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org