It sounds like your Kyuubi server cannot reach your K8s Pod IP (where the Spark driver lives). If so, this is most likely a network infrastructure issue.
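One quick way to confirm this is to test plain TCP reachability from the host where Kyuubi server runs to the engine address registered in ZooKeeper (10.3.173.201:37613 in the quoted logs). Below is a minimal sketch, assuming python3 is available on that host; the helper name `can_connect` is only illustrative and is not part of Kyuubi.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connect timeouts, refused connections and unreachable networks.
        return False


# Example call (address copied from the engine's ZooKeeper registration
# in the quoted logs; substitute the one your current engine registers):
#   can_connect("10.3.173.201", 37613)
```

If this returns False on the Kyuubi server host while the engine is in the STARTED state, the Pod network is probably not routable from that host, which would match the "Connect timed out" in the beeline output.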
Thanks,
Cheng Pan

> On Dec 20, 2024, at 01:34, Aaron Grubb <aa...@kaden.ai> wrote:
>
> Hello,
>
> I'm trying to set up Kyuubi as a JDBC gateway to Spark on Kubernetes with cluster deploy mode. I'm at the point where the driver is running in Kubernetes and the engine is loaded - relevant logs from the driver here:
>
> ------------
>
> 24/12/19 16:51:36 INFO SparkContext: Added JAR file:/tmp/spark-f15bf2f2-1f9a-4cdd-b7b3-9bf1bd228454/kyuubi-spark-sql-engine_2.12-1.10.0.jar at spark://spark-e8895193dfd364d7-driver-svc.spark-dev.svc:7078/jars/kyuubi-spark-sql-engine_2.12-1.10.0.jar with timestamp 1734627095298
> ...
> 24/12/19 16:53:37 INFO KubernetesClusterSchedulerBackend$KubernetesDriverEndpoint: Registered executor NettyRpcEndpointRef(spark-client://Executor) (10.3.23.80:41846) with ID 1, ResourceProfileId 0
> 24/12/19 16:53:37 INFO ExecutorMonitor: New executor 1 has registered (new total is 1)
> 24/12/19 16:53:37 INFO BlockManagerMasterEndpoint: Registering block manager 10.3.23.80:44715 with 5.9 GiB RAM, BlockManagerId(1, 10.3.23.80, 44715, None)
> 24/12/19 16:53:37 INFO TaskSetManager: Starting task 0.0 in stage 0.0 (TID 0) (10.3.23.80, executor 1, partition 0, PROCESS_LOCAL, 9164 bytes)
> 24/12/19 16:53:37 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on 10.3.23.80:44715 (size: 3.6 KiB, free: 5.9 GiB)
> 24/12/19 16:53:38 INFO TaskSetManager: Finished task 0.0 in stage 0.0 (TID 0) in 925 ms on 10.3.23.80 (executor 1) (1/1)
> 24/12/19 16:53:38 INFO TaskSchedulerImpl: Removed TaskSet 0.0, whose tasks have all completed, from pool
> 24/12/19 16:53:38 INFO DAGScheduler: ResultStage 0 (isEmpty at KyuubiSparkUtil.scala:51) finished in 106.477 s
> 24/12/19 16:53:38 INFO DAGScheduler: Job 0 is finished. Cancelling potential speculative or zombie tasks for this job
> 24/12/19 16:53:38 INFO TaskSchedulerImpl: Killing all running tasks in stage 0: Stage finished
> 24/12/19 16:53:38 INFO DAGScheduler: Job 0 finished: isEmpty at KyuubiSparkUtil.scala:51, took 106.600060 s
> 24/12/19 16:53:38 INFO ThreadUtils: SparkSQLSessionManager-exec-pool: pool size: 100, wait queue size: 100, thread keepalive time: 60000 ms
> 24/12/19 16:53:38 INFO SparkSQLOperationManager: Service[SparkSQLOperationManager] is initialized.
> 24/12/19 16:53:38 INFO SparkSQLSessionManager: Service[SparkSQLSessionManager] is initialized.
> 24/12/19 16:53:38 INFO SparkSQLBackendService: Service[SparkSQLBackendService] is initialized.
> 24/12/19 16:53:38 INFO SparkTBinaryFrontendService: Initializing SparkTBinaryFrontend on kyuubi-user-spark-sql-anonymous-default-0a7bfe5b-3506-433d-8e11:37613 with [9, 999] worker threads
> ...
> 24/12/19 16:53:38 INFO ClientCnxn: Opening socket connection to server 10.3.136.134/10.3.136.134:2181. Will not attempt to authenticate using SASL (unknown error)
> 24/12/19 16:53:38 INFO EngineServiceDiscovery: Service[EngineServiceDiscovery] is initialized.
> 24/12/19 16:53:38 INFO SparkTBinaryFrontendService: Service[SparkTBinaryFrontend] is initialized.
> 24/12/19 16:53:38 INFO SparkSQLEngine: Service[SparkSQLEngine] is initialized.
> 24/12/19 16:53:38 INFO ClientCnxn: Socket connection established to 10.3.136.134/10.3.136.134:2181, initiating session
> 24/12/19 16:53:38 INFO ClientCnxn: Session establishment complete on server 10.3.136.134/10.3.136.134:2181, sessionid = 0x10000005d9319dc, negotiated timeout = 40000
> 24/12/19 16:53:38 INFO ConnectionStateManager: State change: CONNECTED
> 24/12/19 16:53:38 INFO ZookeeperDiscoveryClient: Zookeeper client connection state changed to: CONNECTED
> 24/12/19 16:53:38 INFO SparkSQLOperationManager: Service[SparkSQLOperationManager] is started.
> 24/12/19 16:53:38 INFO SparkSQLSessionManager: Service[SparkSQLSessionManager] is started.
> 24/12/19 16:53:38 INFO SparkSQLBackendService: Service[SparkSQLBackendService] is started.
> 24/12/19 16:53:39 INFO ZookeeperDiscoveryClient: Created a /kyuubi_1.10.0_USER_SPARK_SQL/anonymous/default/serverUri=10.3.173.201:37613;version=1.10.0;kyuubi.engine.appMgrInfo=eyJyZXNvdXJjZU1hbmFnZXIiOiJrOHM6Ly9odHRwczovLzI1ODY1MzZDM0FBQzNGNEY4NTFCRkNCMUQ4QUQzODNDLmdyNy51cy1lYXN0LTEuZWtzLmFtYXpvbmF3cy5jb20iLCJrdWJlcm5ldGVzSW5mbyI6eyJjb250ZXh0IjpudWxsLCJuYW1lc3BhY2UiOiJzcGFyay1kZXYifX0=;kyuubi.engine.id=spark-af129813d4f5458590fc46f19f754643;kyuubi.engine.url=spark-e8895193dfd364d7-driver-svc.spark-dev.svc:4040;spark.driver.memory=5140m;spark.executor.memory=10350m;refId=0a7bfe5b-3506-433d-8e11-0db01f269f23;sequence=0000000000 on ZooKeeper for KyuubiServer uri: 10.3.173.201:37613
> 24/12/19 16:53:39 INFO EngineServiceDiscovery: Registered EngineServiceDiscovery in namespace /kyuubi_1.10.0_USER_SPARK_SQL/anonymous/default.
> 24/12/19 16:53:39 INFO EngineServiceDiscovery: Service[EngineServiceDiscovery] is started.
> 24/12/19 16:53:39 INFO SparkTBinaryFrontendService: Service[SparkTBinaryFrontend] is started.
> 24/12/19 16:53:39 INFO SparkSQLEngine: Service[SparkSQLEngine] is started.
> 24/12/19 16:53:39 INFO BlockManagerInfo: Removed broadcast_0_piece0 on 10.3.23.80:44715 in memory (size: 3.6 KiB, free: 5.9 GiB)
> 24/12/19 16:53:39 INFO BlockManagerInfo: Removed broadcast_0_piece0 on spark-e8895193dfd364d7-driver-svc.spark-dev.svc:7079 in memory (size: 3.6 KiB, free: 2.7 GiB)
> 24/12/19 16:53:40 INFO SparkSQLEngine:
> Spark application name: kyuubi_USER_SPARK_SQL_anonymous_default_0a7bfe5b-3506-433d-8e11-0db01f269f23
> application ID: spark-af129813d4f5458590fc46f19f754643
> application tags:
> application web UI: http://spark-e8895193dfd364d7-driver-svc.spark-dev.svc:4040
> master: XXX
> version: 3.5.3
> driver: [cpu: 1, mem: 5140m]
> executor: [cpu: 4, mem: 10350m, maxNum: 1]
> Start time: Thu Dec 19 16:51:35 UTC 2024
>
> User: anonymous (shared mode: USER)
> State: STARTED
>
> -----------
>
> However, when the engine is called with thrift, the connection times out. These are the relevant logs from kyuubi-beeline:
>
> -----------
>
> 2024-12-19 17:11:26.998 ERROR KyuubiSessionManager-exec-pool: Thread-93 org.apache.kyuubi.session.KyuubiSessionImpl: Opening engine [kyuubi_USER_SPARK_SQL_anonymous_default_64dceadd-595d-402a-adc2-4e0a6d31fdf6 10.3.173.201:37613] for anonymous session failed
> org.apache.kyuubi.shaded.thrift.transport.TTransportException: java.net.SocketTimeoutException: Connect timed out
>     at org.apache.kyuubi.shaded.thrift.transport.TSocket.open(TSocket.java:255)
>     at org.apache.kyuubi.shaded.thrift.transport.TSaslTransport.open(TSaslTransport.java:233)
>     at org.apache.kyuubi.shaded.thrift.transport.TSaslClientTransport.open(TSaslClientTransport.java:39)
>     at org.apache.kyuubi.client.KyuubiSyncThriftClient$.createTProtocol(KyuubiSyncThriftClient.scala:478)
>     at org.apache.kyuubi.client.KyuubiSyncThriftClient$.createClient(KyuubiSyncThriftClient.scala:495)
>     at org.apache.kyuubi.session.KyuubiSessionImpl.$anonfun$openEngineSession$2(KyuubiSessionImpl.scala:177)
>     at org.apache.kyuubi.session.KyuubiSessionImpl.$anonfun$openEngineSession$2$adapted(KyuubiSessionImpl.scala:134)
>     at org.apache.kyuubi.ha.client.DiscoveryClientProvider$.withDiscoveryClient(DiscoveryClientProvider.scala:36)
>     at org.apache.kyuubi.session.KyuubiSessionImpl.$anonfun$openEngineSession$1(KyuubiSessionImpl.scala:134)
>     at org.apache.kyuubi.session.KyuubiSession.handleSessionException(KyuubiSession.scala:49)
>     at org.apache.kyuubi.session.KyuubiSessionImpl.openEngineSession(KyuubiSessionImpl.scala:134)
>     at org.apache.kyuubi.operation.LaunchEngine.$anonfun$runInternal$1(LaunchEngine.scala:60)
>     at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539)
>     at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
>     at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
>     at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
>     at java.base/java.lang.Thread.run(Thread.java:840)
> Caused by: java.net.SocketTimeoutException: Connect timed out
>     at java.base/sun.nio.ch.NioSocketImpl.timedFinishConnect(NioSocketImpl.java:551)
>     at java.base/sun.nio.ch.NioSocketImpl.connect(NioSocketImpl.java:602)
>     at java.base/java.net.SocksSocketImpl.connect(SocksSocketImpl.java:327)
>     at java.base/java.net.Socket.connect(Socket.java:633)
>     at org.apache.kyuubi.shaded.thrift.transport.TSocket.open(TSocket.java:250)
>     ...
> Error: org.apache.kyuubi.KyuubiSQLException: Error operating LaunchEngine: org.apache.kyuubi.shaded.thrift.transport.TTransportException: java.net.SocketTimeoutException: Connect timed out
>     at org.apache.kyuubi.shaded.thrift.transport.TSocket.open(TSocket.java:255)
>     at org.apache.kyuubi.shaded.thrift.transport.TSaslTransport.open(TSaslTransport.java:233)
>     at org.apache.kyuubi.shaded.thrift.transport.TSaslClientTransport.open(TSaslClientTransport.java:39)
>     at org.apache.kyuubi.client.KyuubiSyncThriftClient$.createTProtocol(KyuubiSyncThriftClient.scala:478)
>     at org.apache.kyuubi.client.KyuubiSyncThriftClient$.createClient(KyuubiSyncThriftClient.scala:495)
>     at org.apache.kyuubi.session.KyuubiSessionImpl.$anonfun$openEngineSession$2(KyuubiSessionImpl.scala:177)
>     at org.apache.kyuubi.session.KyuubiSessionImpl.$anonfun$openEngineSession$2$adapted(KyuubiSessionImpl.scala:134)
>     at org.apache.kyuubi.ha.client.DiscoveryClientProvider$.withDiscoveryClient(DiscoveryClientProvider.scala:36)
>     at org.apache.kyuubi.session.KyuubiSessionImpl.$anonfun$openEngineSession$1(KyuubiSessionImpl.scala:134)
>     at org.apache.kyuubi.session.KyuubiSession.handleSessionException(KyuubiSession.scala:49)
>     at org.apache.kyuubi.session.KyuubiSessionImpl.openEngineSession(KyuubiSessionImpl.scala:134)
>     at org.apache.kyuubi.operation.LaunchEngine.$anonfun$runInternal$1(LaunchEngine.scala:60)
>     at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539)
>     at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
>     at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
>     at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
>     at java.base/java.lang.Thread.run(Thread.java:840)
> Caused by: java.net.SocketTimeoutException: Connect timed out
>     at java.base/sun.nio.ch.NioSocketImpl.timedFinishConnect(NioSocketImpl.java:551)
>     at java.base/sun.nio.ch.NioSocketImpl.connect(NioSocketImpl.java:602)
>     at java.base/java.net.SocksSocketImpl.connect(SocksSocketImpl.java:327)
>     at java.base/java.net.Socket.connect(Socket.java:633)
>     at org.apache.kyuubi.shaded.thrift.transport.TSocket.open(TSocket.java:250)
>     ... 16 more
>
>     at org.apache.kyuubi.KyuubiSQLException$.apply(KyuubiSQLException.scala:69)
>     at org.apache.kyuubi.operation.KyuubiOperation$$anonfun$onError$1.$anonfun$applyOrElse$1(KyuubiOperation.scala:94)
>     at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
>     at org.apache.kyuubi.Utils$.withLockRequired(Utils.scala:392)
>     at org.apache.kyuubi.operation.AbstractOperation.withLockRequired(AbstractOperation.scala:52)
>     at org.apache.kyuubi.operation.KyuubiOperation$$anonfun$onError$1.applyOrElse(KyuubiOperation.scala:78)
>     at org.apache.kyuubi.operation.KyuubiOperation$$anonfun$onError$1.applyOrElse(KyuubiOperation.scala:75)
>     at scala.runtime.AbstractPartialFunction.apply(AbstractPartialFunction.scala:38)
>     at org.apache.kyuubi.operation.LaunchEngine.$anonfun$runInternal$1(LaunchEngine.scala:62)
>     at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539)
>     at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
>     at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
>     at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
>     at java.base/java.lang.Thread.run(Thread.java:840)
> Caused by: org.apache.kyuubi.shaded.thrift.transport.TTransportException: java.net.SocketTimeoutException: Connect timed out
>     at org.apache.kyuubi.shaded.thrift.transport.TSocket.open(TSocket.java:255)
>     at org.apache.kyuubi.shaded.thrift.transport.TSaslTransport.open(TSaslTransport.java:233)
>     at org.apache.kyuubi.shaded.thrift.transport.TSaslClientTransport.open(TSaslClientTransport.java:39)
>     at org.apache.kyuubi.client.KyuubiSyncThriftClient$.createTProtocol(KyuubiSyncThriftClient.scala:478)
>     at org.apache.kyuubi.client.KyuubiSyncThriftClient$.createClient(KyuubiSyncThriftClient.scala:495)
>     at org.apache.kyuubi.session.KyuubiSessionImpl.$anonfun$openEngineSession$2(KyuubiSessionImpl.scala:177)
>     at org.apache.kyuubi.session.KyuubiSessionImpl.$anonfun$openEngineSession$2$adapted(KyuubiSessionImpl.scala:134)
>     at org.apache.kyuubi.ha.client.DiscoveryClientProvider$.withDiscoveryClient(DiscoveryClientProvider.scala:36)
>     at org.apache.kyuubi.session.KyuubiSessionImpl.$anonfun$openEngineSession$1(KyuubiSessionImpl.scala:134)
>     at org.apache.kyuubi.session.KyuubiSession.handleSessionException(KyuubiSession.scala:49)
>     at org.apache.kyuubi.session.KyuubiSessionImpl.openEngineSession(KyuubiSessionImpl.scala:134)
>     at org.apache.kyuubi.operation.LaunchEngine.$anonfun$runInternal$1(LaunchEngine.scala:60)
>     ... 5 more
> Caused by: java.net.SocketTimeoutException: Connect timed out
>     at java.base/sun.nio.ch.NioSocketImpl.timedFinishConnect(NioSocketImpl.java:551)
>     at java.base/sun.nio.ch.NioSocketImpl.connect(NioSocketImpl.java:602)
>     at java.base/java.net.SocksSocketImpl.connect(SocksSocketImpl.java:327)
>     at java.base/java.net.Socket.connect(Socket.java:633)
>     at org.apache.kyuubi.shaded.thrift.transport.TSocket.open(TSocket.java:250)
>     ... 16 more (state=,code=0)
>
> ----------
>
> The engine continues running and subsequent beeline calls return the same timeout error. Java 17, Scala 2.12, Kyuubi 1.10, Spark 3.5.3, running on AARCH64. Any thoughts or suggestions?
>
> Thanks,
> Aaron