prakash gurung created SPARK-46128:
--------------------------------------

             Summary: External scheduler cannot be instantiated
                 Key: SPARK-46128
                 URL: https://issues.apache.org/jira/browse/SPARK-46128
             Project: Spark
          Issue Type: Bug
          Components: Kubernetes, Spark Core, Spark Submit
    Affects Versions: 3.5.0, 3.1.2
            Reporter: prakash gurung


The spark-submit driver fails to resolve "kubernetes.default.svc" when trying to create executors, so SparkContext initialization aborts with "External scheduler cannot be instantiated".
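The bottom of the stack trace below shows the root cause is plain DNS: java.net.UnknownHostException means the driver pod's resolver cannot look up the in-cluster API server name at all. The same check, expressed as an illustrative Python sketch (not part of Spark; the function name is mine), is:

```python
import socket

def can_resolve(host: str) -> bool:
    """Return True if `host` resolves via the local DNS configuration."""
    try:
        socket.getaddrinfo(host, 443)
        return True
    except socket.gaierror:
        # Same failure class the JVM reports as java.net.UnknownHostException
        return False

# Inside a healthy pod this should be True; in the failing driver pod it is False:
# can_resolve("kubernetes.default.svc")
```

Inside a pod, "kubernetes.default.svc" only resolves because the kubelet injects cluster-DNS search domains into the pod's resolver configuration, so this failing suggests the pod is not getting (or not reaching) the cluster DNS service.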

Spark versions tried:
 * 3.5.0
 * 3.1.2

On-premises Kubernetes cluster, provisioned with kubeadm:
 * Kubernetes version: v1.28.2
 * OS: Ubuntu 22.04.1 (Jammy)
 * Container Runtime: 1.6.24

Complete error:
{code:java}
+ shift 1
+ CMD=("$SPARK_HOME/bin/spark-submit" --conf "spark.driver.bindAddress=$SPARK_DRIVER_BIND_ADDRESS" --deploy-mode client "$@")
+ exec /usr/bin/tini -s -- /opt/spark/bin/spark-submit --conf spark.driver.bindAddress=10.48.131.135 --deploy-mode client --properties-file /opt/spark/conf/spark.properties --class org.apache.spark.examples.SparkPi local:///opt/spark/examples/jars/spark-examples_2.12-3.1.2.jar
WARNING: An illegal reflective access operation has occurred
WARNING: Illegal reflective access by org.apache.spark.unsafe.Platform (file:/opt/spark/jars/spark-unsafe_2.12-3.1.2.jar) to constructor java.nio.DirectByteBuffer(long,int)
WARNING: Please consider reporting this to the maintainers of org.apache.spark.unsafe.Platform
WARNING: Use --illegal-access=warn to enable warnings of further illegal reflective access operations
WARNING: All illegal access operations will be denied in a future release
23/11/22 03:27:20 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
23/11/22 03:27:20 INFO SparkContext: Running Spark version 3.1.2
23/11/22 03:27:20 INFO ResourceUtils: ==============================================================
23/11/22 03:27:20 INFO ResourceUtils: No custom resources configured for spark.driver.
23/11/22 03:27:20 INFO ResourceUtils: ==============================================================
23/11/22 03:27:20 INFO SparkContext: Submitted application: Spark Pi
23/11/22 03:27:20 INFO ResourceProfile: Default ResourceProfile created, executor resources: Map(cores -> name: cores, amount: 1, script: , vendor: , memory -> name: memory, amount: 1024, script: , vendor: , offHeap -> name: offHeap, amount: 0, script: , vendor: ), task resources: Map(cpus -> name: cpus, amount: 1.0)
23/11/22 03:27:20 INFO ResourceProfile: Limiting resource is cpus at 1 tasks per executor
23/11/22 03:27:20 INFO ResourceProfileManager: Added ResourceProfile id: 0
23/11/22 03:27:20 INFO SecurityManager: Changing view acls to: 185,root
23/11/22 03:27:20 INFO SecurityManager: Changing modify acls to: 185,root
23/11/22 03:27:20 INFO SecurityManager: Changing view acls groups to:
23/11/22 03:27:20 INFO SecurityManager: Changing modify acls groups to:
23/11/22 03:27:20 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users  with view permissions: Set(185, root); groups with view permissions: Set(); users  with modify permissions: Set(185, root); groups with modify permissions: Set()
23/11/22 03:27:20 INFO Utils: Successfully started service 'sparkDriver' on port 7078.
23/11/22 03:27:20 INFO SparkEnv: Registering MapOutputTracker
23/11/22 03:27:20 INFO SparkEnv: Registering BlockManagerMaster
23/11/22 03:27:20 INFO BlockManagerMasterEndpoint: Using org.apache.spark.storage.DefaultTopologyMapper for getting topology information
23/11/22 03:27:20 INFO BlockManagerMasterEndpoint: BlockManagerMasterEndpoint up
23/11/22 03:27:20 INFO SparkEnv: Registering BlockManagerMasterHeartbeat
23/11/22 03:27:20 INFO DiskBlockManager: Created local directory at /var/data/spark-9239c605-130e-4feb-b050-a33546d330bb/blockmgr-dd78ca51-ba55-4da9-82e3-6d4f17b69753
23/11/22 03:27:20 INFO MemoryStore: MemoryStore started with capacity 413.9 MiB
23/11/22 03:27:20 INFO SparkEnv: Registering OutputCommitCoordinator
23/11/22 03:27:20 INFO Utils: Successfully started service 'SparkUI' on port 4040.
23/11/22 03:27:20 INFO SparkUI: Bound SparkUI to 0.0.0.0, and started at http://spark-pi-d465538bf5115d50-driver-svc.default.svc:4040
23/11/22 03:27:20 INFO SparkContext: Added JAR local:///opt/spark/examples/jars/spark-examples_2.12-3.1.2.jar at file:/opt/spark/examples/jars/spark-examples_2.12-3.1.2.jar with timestamp 1700623640246
23/11/22 03:27:20 WARN SparkContext: The jar local:///opt/spark/examples/jars/spark-examples_2.12-3.1.2.jar has been added already. Overwriting of added jars is not supported in the current version.
23/11/22 03:27:20 INFO SparkKubernetesClientFactory: Auto-configuring K8S client using current context from users K8S config file
23/11/22 03:27:41 ERROR SparkContext: Error initializing SparkContext.
org.apache.spark.SparkException: External scheduler cannot be instantiated
    at org.apache.spark.SparkContext$.org$apache$spark$SparkContext$$createTaskScheduler(SparkContext.scala:2961)
    at org.apache.spark.SparkContext.<init>(SparkContext.scala:557)
    at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2672)
    at org.apache.spark.sql.SparkSession$Builder.$anonfun$getOrCreate$2(SparkSession.scala:945)
    at scala.Option.getOrElse(Option.scala:189)
    at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:939)
    at org.apache.spark.examples.SparkPi$.main(SparkPi.scala:30)
    at org.apache.spark.examples.SparkPi.main(SparkPi.scala)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
    at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
    at java.base/java.lang.reflect.Method.invoke(Unknown Source)
    at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
    at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:951)
    at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:180)
    at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203)
    at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90)
    at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1039)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1048)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: io.fabric8.kubernetes.client.KubernetesClientException: Operation: [get]  for kind: [Pod]  with name: [spark-pi-d465538bf5115d50-driver]  in namespace: [default]  failed.
    at io.fabric8.kubernetes.client.KubernetesClientException.launderThrowable(KubernetesClientException.java:64)
    at io.fabric8.kubernetes.client.KubernetesClientException.launderThrowable(KubernetesClientException.java:72)
    at io.fabric8.kubernetes.client.dsl.base.BaseOperation.getMandatory(BaseOperation.java:225)
    at io.fabric8.kubernetes.client.dsl.base.BaseOperation.get(BaseOperation.java:186)
    at io.fabric8.kubernetes.client.dsl.base.BaseOperation.get(BaseOperation.java:84)
    at org.apache.spark.scheduler.cluster.k8s.ExecutorPodsAllocator.$anonfun$driverPod$1(ExecutorPodsAllocator.scala:75)
    at scala.Option.map(Option.scala:230)
    at org.apache.spark.scheduler.cluster.k8s.ExecutorPodsAllocator.<init>(ExecutorPodsAllocator.scala:74)
    at org.apache.spark.scheduler.cluster.k8s.KubernetesClusterManager.createSchedulerBackend(KubernetesClusterManager.scala:123)
    at org.apache.spark.SparkContext$.org$apache$spark$SparkContext$$createTaskScheduler(SparkContext.scala:2955)
    ... 19 more
Caused by: java.net.UnknownHostException: kubernetes.default.svc: Temporary failure in name resolution
    at java.base/java.net.Inet6AddressImpl.lookupAllHostAddr(Native Method)
    at java.base/java.net.InetAddress$PlatformNameService.lookupAllHostAddr(Unknown Source)
    at java.base/java.net.InetAddress.getAddressesFromNameService(Unknown Source)
    at java.base/java.net.InetAddress$NameServiceAddresses.get(Unknown Source)
    at java.base/java.net.InetAddress.getAllByName0(Unknown Source)
    at java.base/java.net.InetAddress.getAllByName(Unknown Source)
    at java.base/java.net.InetAddress.getAllByName(Unknown Source)
    at okhttp3.Dns$1.lookup(Dns.java:40)
    at okhttp3.internal.connection.RouteSelector.resetNextInetSocketAddress(RouteSelector.java:185)
    at okhttp3.internal.connection.RouteSelector.nextProxy(RouteSelector.java:149)
    at okhttp3.internal.connection.RouteSelector.next(RouteSelector.java:84)
    at okhttp3.internal.connection.StreamAllocation.findConnection(StreamAllocation.java:215)
    at okhttp3.internal.connection.StreamAllocation.findHealthyConnection(StreamAllocation.java:135)
    at okhttp3.internal.connection.StreamAllocation.newStream(StreamAllocation.java:114)
    at okhttp3.internal.connection.ConnectInterceptor.intercept(ConnectInterceptor.java:42)
    at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147)
    at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:121)
    at okhttp3.internal.cache.CacheInterceptor.intercept(CacheInterceptor.java:93)
    at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147)
    at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:121)
    at okhttp3.internal.http.BridgeInterceptor.intercept(BridgeInterceptor.java:93)
    at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147)
    at okhttp3.internal.http.RetryAndFollowUpInterceptor.intercept(RetryAndFollowUpInterceptor.java:127)
    at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147)
    at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:121)
    at io.fabric8.kubernetes.client.utils.BackwardsCompatibilityInterceptor.intercept(BackwardsCompatibilityInterceptor.java:135)
    at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147)
    at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:121)
    at io.fabric8.kubernetes.client.utils.OIDCTokenRefreshInterceptor.intercept(OIDCTokenRefreshInterceptor.java:41)
    at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147)
    at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:121)
    at io.fabric8.kubernetes.client.utils.ImpersonatorInterceptor.intercept(ImpersonatorInterceptor.java:68)
    at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147)
    at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:121)
    at io.fabric8.kubernetes.client.utils.HttpClientUtils.lambda$createHttpClient$3(HttpClientUtils.java:151)
    at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147)
    at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:121)
    at okhttp3.RealCall.getResponseWithInterceptorChain(RealCall.java:257)
    at okhttp3.RealCall.execute(RealCall.java:93)
    at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:490)
    at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:451)
    at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleGet(OperationSupport.java:416)
    at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleGet(OperationSupport.java:397)
    at io.fabric8.kubernetes.client.dsl.base.BaseOperation.handleGet(BaseOperation.java:933)
    at io.fabric8.kubernetes.client.dsl.base.BaseOperation.getMandatory(BaseOperation.java:220)
    ... 26 more
23/11/22 03:27:41 INFO SparkUI: Stopped Spark web UI at http://spark-pi-d465538bf5115d50-driver-svc.default.svc:4040
23/11/22 03:27:41 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
23/11/22 03:27:41 INFO MemoryStore: MemoryStore cleared
23/11/22 03:27:41 INFO BlockManager: BlockManager stopped
23/11/22 03:27:41 INFO BlockManagerMaster: BlockManagerMaster stopped
23/11/22 03:27:41 WARN MetricsSystem: Stopping a MetricsSystem that is not running
23/11/22 03:27:41 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
23/11/22 03:27:41 INFO SparkContext: Successfully stopped SparkContext
Exception in thread "main" org.apache.spark.SparkException: External scheduler cannot be instantiated
    at org.apache.spark.SparkContext$.org$apache$spark$SparkContext$$createTaskScheduler(SparkContext.scala:2961)
    at org.apache.spark.SparkContext.<init>(SparkContext.scala:557)
    at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2672)
    at org.apache.spark.sql.SparkSession$Builder.$anonfun$getOrCreate$2(SparkSession.scala:945)
    at scala.Option.getOrElse(Option.scala:189)
    at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:939)
    at org.apache.spark.examples.SparkPi$.main(SparkPi.scala:30)
    at org.apache.spark.examples.SparkPi.main(SparkPi.scala)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
    at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
    at java.base/java.lang.reflect.Method.invoke(Unknown Source)
    at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
    at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:951)
    at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:180)
    at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203)
    at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90)
    at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1039)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1048)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: io.fabric8.kubernetes.client.KubernetesClientException: Operation: [get]  for kind: [Pod]  with name: [spark-pi-d465538bf5115d50-driver]  in namespace: [default]  failed.
    at io.fabric8.kubernetes.client.KubernetesClientException.launderThrowable(KubernetesClientException.java:64)
    at io.fabric8.kubernetes.client.KubernetesClientException.launderThrowable(KubernetesClientException.java:72)
    at io.fabric8.kubernetes.client.dsl.base.BaseOperation.getMandatory(BaseOperation.java:225)
    at io.fabric8.kubernetes.client.dsl.base.BaseOperation.get(BaseOperation.java:186)
    at io.fabric8.kubernetes.client.dsl.base.BaseOperation.get(BaseOperation.java:84)
    at org.apache.spark.scheduler.cluster.k8s.ExecutorPodsAllocator.$anonfun$driverPod$1(ExecutorPodsAllocator.scala:75)
    at scala.Option.map(Option.scala:230)
    at org.apache.spark.scheduler.cluster.k8s.ExecutorPodsAllocator.<init>(ExecutorPodsAllocator.scala:74)
    at org.apache.spark.scheduler.cluster.k8s.KubernetesClusterManager.createSchedulerBackend(KubernetesClusterManager.scala:123)
    at org.apache.spark.SparkContext$.org$apache$spark$SparkContext$$createTaskScheduler(SparkContext.scala:2955)
    ... 19 more
Caused by: java.net.UnknownHostException: kubernetes.default.svc: Temporary failure in name resolution
    at java.base/java.net.Inet6AddressImpl.lookupAllHostAddr(Native Method)
    at java.base/java.net.InetAddress$PlatformNameService.lookupAllHostAddr(Unknown Source)
    at java.base/java.net.InetAddress.getAddressesFromNameService(Unknown Source)
    at java.base/java.net.InetAddress$NameServiceAddresses.get(Unknown Source)
    at java.base/java.net.InetAddress.getAllByName0(Unknown Source)
    at java.base/java.net.InetAddress.getAllByName(Unknown Source)
    at java.base/java.net.InetAddress.getAllByName(Unknown Source)
    at okhttp3.Dns$1.lookup(Dns.java:40)
    at okhttp3.internal.connection.RouteSelector.resetNextInetSocketAddress(RouteSelector.java:185)
    at okhttp3.internal.connection.RouteSelector.nextProxy(RouteSelector.java:149)
    at okhttp3.internal.connection.RouteSelector.next(RouteSelector.java:84)
    at okhttp3.internal.connection.StreamAllocation.findConnection(StreamAllocation.java:215)
    at okhttp3.internal.connection.StreamAllocation.findHealthyConnection(StreamAllocation.java:135)
    at okhttp3.internal.connection.StreamAllocation.newStream(StreamAllocation.java:114)
    at okhttp3.internal.connection.ConnectInterceptor.intercept(ConnectInterceptor.java:42)
    at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147)
    at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:121)
    at okhttp3.internal.cache.CacheInterceptor.intercept(CacheInterceptor.java:93)
    at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147)
    at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:121)
    at okhttp3.internal.http.BridgeInterceptor.intercept(BridgeInterceptor.java:93)
    at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147)
    at okhttp3.internal.http.RetryAndFollowUpInterceptor.intercept(RetryAndFollowUpInterceptor.java:127)
    at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147)
    at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:121)
    at io.fabric8.kubernetes.client.utils.BackwardsCompatibilityInterceptor.intercept(BackwardsCompatibilityInterceptor.java:135)
    at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147)
    at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:121)
    at io.fabric8.kubernetes.client.utils.OIDCTokenRefreshInterceptor.intercept(OIDCTokenRefreshInterceptor.java:41)
    at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147)
    at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:121)
    at io.fabric8.kubernetes.client.utils.ImpersonatorInterceptor.intercept(ImpersonatorInterceptor.java:68)
    at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147)
    at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:121)
    at io.fabric8.kubernetes.client.utils.HttpClientUtils.lambda$createHttpClient$3(HttpClientUtils.java:151)
    at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147)
    at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:121)
    at okhttp3.RealCall.getResponseWithInterceptorChain(RealCall.java:257)
    at okhttp3.RealCall.execute(RealCall.java:93)
    at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:490)
    at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:451)
    at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleGet(OperationSupport.java:416)
    at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleGet(OperationSupport.java:397)
    at io.fabric8.kubernetes.client.dsl.base.BaseOperation.handleGet(BaseOperation.java:933)
    at io.fabric8.kubernetes.client.dsl.base.BaseOperation.getMandatory(BaseOperation.java:220)
    ... 26 more
23/11/22 03:27:41 INFO ShutdownHookManager: Shutdown hook called
23/11/22 03:27:41 INFO ShutdownHookManager: Deleting directory /tmp/spark-2f306210-bd49-47ad-a12b-db283e4ca6fd
23/11/22 03:27:41 INFO ShutdownHookManager: Deleting directory /var/data/spark-9239c605-130e-4feb-b050-a33546d330bb/spark-8840557b-371c-413e-a29c-a1e8f2ec748a
{code}
Similar issue: https://issues.apache.org/jira/browse/SPARK-29640
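For comparison, on a kubeadm cluster with default settings, a pod running with dnsPolicy: ClusterFirst normally gets an /etc/resolv.conf along these lines (the 10.96.0.10 cluster-DNS service IP and the cluster.local domain are kubeadm defaults and may differ in other setups):

```
nameserver 10.96.0.10
search default.svc.cluster.local svc.cluster.local cluster.local
options ndots:5
```

With that search list in place, "kubernetes.default.svc" expands to "kubernetes.default.svc.cluster.local" and resolves via CoreDNS; the UnknownHostException above suggests the driver pod is either missing this configuration or cannot reach the DNS service.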


