It sounds like your Kyuubi server cannot reach the K8s Pod IP where the
Spark driver lives. If so, this is most likely a network infrastructure issue.
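
One way to confirm this is to run a plain TCP connect test from the host where
the Kyuubi server runs. The snippet below is only a minimal sketch: the helper
name can_connect and the 5-second default timeout are illustrative assumptions
and not part of Kyuubi. It only checks that a TCP connection can be opened and
does not speak Thrift.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port opens within `timeout` seconds.

    Hypothetical helper for illustration only; it is not part of Kyuubi.
    """
    try:
        # create_connection resolves the host and attempts a TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, unreachable network, and connect timeout.
        return False
```

For example, calling can_connect("10.3.173.201", 37613) from the Kyuubi server
host uses the engine address that appears in your quoted log (the port changes
on each engine launch). If it returns False there but True from inside another
pod in the cluster, that would point at routing between the Kyuubi server and
the pod network rather than at the engine itself.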

Thanks,
Cheng Pan



> On Dec 20, 2024, at 01:34, Aaron Grubb <aa...@kaden.ai> wrote:
> 
> Hello,
> 
> I'm trying to set up Kyuubi as a JDBC gateway to Spark on Kubernetes with 
> cluster deploy mode. I'm at the point where the driver is running
> in Kubernetes and the engine is loaded - relevant logs from the driver here:
> 
> ------------
> 
> 24/12/19 16:51:36 INFO SparkContext: Added JAR 
> file:/tmp/spark-f15bf2f2-1f9a-4cdd-b7b3-9bf1bd228454/kyuubi-spark-sql-engine_2.12-1.10.0.jar
> at 
> spark://spark-e8895193dfd364d7-driver-svc.spark-dev.svc:7078/jars/kyuubi-spark-sql-engine_2.12-1.10.0.jar
>  with timestamp 1734627095298
> ...
> 24/12/19 16:53:37 INFO 
> KubernetesClusterSchedulerBackend$KubernetesDriverEndpoint: Registered 
> executor NettyRpcEndpointRef(spark-
> client://Executor) (10.3.23.80:41846) with ID 1,  ResourceProfileId 0
> 24/12/19 16:53:37 INFO ExecutorMonitor: New executor 1 has registered (new 
> total is 1)
> 24/12/19 16:53:37 INFO BlockManagerMasterEndpoint: Registering block manager 
> 10.3.23.80:44715 with 5.9 GiB RAM, BlockManagerId(1, 10.3.23.80,
> 44715, None)
> 24/12/19 16:53:37 INFO TaskSetManager: Starting task 0.0 in stage 0.0 (TID 0) 
> (10.3.23.80, executor 1, partition 0, PROCESS_LOCAL, 9164
> bytes) 
> 24/12/19 16:53:37 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory 
> on 10.3.23.80:44715 (size: 3.6 KiB, free: 5.9 GiB)
> 24/12/19 16:53:38 INFO TaskSetManager: Finished task 0.0 in stage 0.0 (TID 0) 
> in 925 ms on 10.3.23.80 (executor 1) (1/1)
> 24/12/19 16:53:38 INFO TaskSchedulerImpl: Removed TaskSet 0.0, whose tasks 
> have all completed, from pool 
> 24/12/19 16:53:38 INFO DAGScheduler: ResultStage 0 (isEmpty at 
> KyuubiSparkUtil.scala:51) finished in 106.477 s
> 24/12/19 16:53:38 INFO DAGScheduler: Job 0 is finished. Cancelling potential 
> speculative or zombie tasks for this job
> 24/12/19 16:53:38 INFO TaskSchedulerImpl: Killing all running tasks in stage 
> 0: Stage finished
> 24/12/19 16:53:38 INFO DAGScheduler: Job 0 finished: isEmpty at 
> KyuubiSparkUtil.scala:51, took 106.600060 s
> 24/12/19 16:53:38 INFO ThreadUtils: SparkSQLSessionManager-exec-pool: pool 
> size: 100, wait queue size: 100, thread keepalive time: 60000 ms
> 24/12/19 16:53:38 INFO SparkSQLOperationManager: 
> Service[SparkSQLOperationManager] is initialized.
> 24/12/19 16:53:38 INFO SparkSQLSessionManager: 
> Service[SparkSQLSessionManager] is initialized.
> 24/12/19 16:53:38 INFO SparkSQLBackendService: 
> Service[SparkSQLBackendService] is initialized.
> 24/12/19 16:53:38 INFO SparkTBinaryFrontendService: Initializing 
> SparkTBinaryFrontend on kyuubi-user-spark-sql-anonymous-default-0a7bfe5b-
> 3506-433d-8e11:37613 with [9, 999] worker threads
> ...
> 24/12/19 16:53:38 INFO ClientCnxn: Opening socket connection to server 
> 10.3.136.134/10.3.136.134:2181. Will not attempt to authenticate using
> SASL (unknown error)
> 24/12/19 16:53:38 INFO EngineServiceDiscovery: 
> Service[EngineServiceDiscovery] is initialized.
> 24/12/19 16:53:38 INFO SparkTBinaryFrontendService: 
> Service[SparkTBinaryFrontend] is initialized.
> 24/12/19 16:53:38 INFO SparkSQLEngine: Service[SparkSQLEngine] is initialized.
> 24/12/19 16:53:38 INFO ClientCnxn: Socket connection established to 
> 10.3.136.134/10.3.136.134:2181, initiating session
> 24/12/19 16:53:38 INFO ClientCnxn: Session establishment complete on server 
> 10.3.136.134/10.3.136.134:2181, sessionid = 0x10000005d9319dc,
> negotiated timeout = 40000
> 24/12/19 16:53:38 INFO ConnectionStateManager: State change: CONNECTED
> 24/12/19 16:53:38 INFO ZookeeperDiscoveryClient: Zookeeper client connection 
> state changed to: CONNECTED
> 24/12/19 16:53:38 INFO SparkSQLOperationManager: 
> Service[SparkSQLOperationManager] is started.
> 24/12/19 16:53:38 INFO SparkSQLSessionManager: 
> Service[SparkSQLSessionManager] is started.
> 24/12/19 16:53:38 INFO SparkSQLBackendService: 
> Service[SparkSQLBackendService] is started.
> 24/12/19 16:53:39 INFO ZookeeperDiscoveryClient: Created a
> /kyuubi_1.10.0_USER_SPARK_SQL/anonymous/default/serverUri=10.3.173.201:37613;version=1.10.0;kyuubi.engine.appMgrInfo=eyJyZXNvdXJjZU1hbmFnZXIi
> OiJrOHM6Ly9odHRwczovLzI1ODY1MzZDM0FBQzNGNEY4NTFCRkNCMUQ4QUQzODNDLmdyNy51cy1lYXN0LTEuZWtzLmFtYXpvbmF3cy5jb20iLCJrdWJlcm5ldGVzSW5mbyI6eyJjb250Z
> Xh0IjpudWxsLCJuYW1lc3BhY2UiOiJzcGFyay1kZXYifX0=;kyuubi.engine.id=spark-af129813d4f5458590fc46f19f754643;kyuubi.engine.url=spark-
> e8895193dfd364d7-driver-svc.spark-dev.svc:4040;spark.driver.memory=5140m;spark.executor.memory=10350m;refId=0a7bfe5b-3506-433d-8e11-
> 0db01f269f23;sequence=0000000000 on ZooKeeper for KyuubiServer uri: 
> 10.3.173.201:37613
> 24/12/19 16:53:39 INFO EngineServiceDiscovery: Registered 
> EngineServiceDiscovery in namespace
> /kyuubi_1.10.0_USER_SPARK_SQL/anonymous/default.
> 24/12/19 16:53:39 INFO EngineServiceDiscovery: 
> Service[EngineServiceDiscovery] is started.
> 24/12/19 16:53:39 INFO SparkTBinaryFrontendService: 
> Service[SparkTBinaryFrontend] is started.
> 24/12/19 16:53:39 INFO SparkSQLEngine: Service[SparkSQLEngine] is started.
> 24/12/19 16:53:39 INFO BlockManagerInfo: Removed broadcast_0_piece0 on 
> 10.3.23.80:44715 in memory (size: 3.6 KiB, free: 5.9 GiB)
> 24/12/19 16:53:39 INFO BlockManagerInfo: Removed broadcast_0_piece0 on 
> spark-e8895193dfd364d7-driver-svc.spark-dev.svc:7079 in memory (size:
> 3.6 KiB, free: 2.7 GiB)
> 24/12/19 16:53:40 INFO SparkSQLEngine: 
>    Spark application name: 
> kyuubi_USER_SPARK_SQL_anonymous_default_0a7bfe5b-3506-433d-8e11-0db01f269f23
>          application ID:  spark-af129813d4f5458590fc46f19f754643
>          application tags: 
>          application web UI: 
> http://spark-e8895193dfd364d7-driver-svc.spark-dev.svc:4040
>          master: XXX
>          version: 3.5.3
>          driver: [cpu: 1, mem: 5140m]
>          executor: [cpu: 4, mem: 10350m, maxNum: 1]
>    Start time: Thu Dec 19 16:51:35 UTC 2024
> 
>    User: anonymous (shared mode: USER)
>    State: STARTED
> 
> -----------
> 
> However, when the engine is called with thrift, the connection times out. 
> These are the relevant logs from kyuubi-beeline:
> 
> -----------
> 
> 2024-12-19 17:11:26.998 ERROR KyuubiSessionManager-exec-pool: Thread-93 
> org.apache.kyuubi.session.KyuubiSessionImpl: Opening engine
> [kyuubi_USER_SPARK_SQL_anonymous_default_64dceadd-595d-402a-adc2-4e0a6d31fdf6 
> 10.3.173.201:37613] for anonymous session failed
> org.apache.kyuubi.shaded.thrift.transport.TTransportException: 
> java.net.SocketTimeoutException: Connect timed out
>       at 
> org.apache.kyuubi.shaded.thrift.transport.TSocket.open(TSocket.java:255)
>       at 
> org.apache.kyuubi.shaded.thrift.transport.TSaslTransport.open(TSaslTransport.java:233)
>       at 
> org.apache.kyuubi.shaded.thrift.transport.TSaslClientTransport.open(TSaslClientTransport.java:39)
>       at 
> org.apache.kyuubi.client.KyuubiSyncThriftClient$.createTProtocol(KyuubiSyncThriftClient.scala:478)
>       at 
> org.apache.kyuubi.client.KyuubiSyncThriftClient$.createClient(KyuubiSyncThriftClient.scala:495)
>       at 
> org.apache.kyuubi.session.KyuubiSessionImpl.$anonfun$openEngineSession$2(KyuubiSessionImpl.scala:177)
>       at 
> org.apache.kyuubi.session.KyuubiSessionImpl.$anonfun$openEngineSession$2$adapted(KyuubiSessionImpl.scala:134)
>       at 
> org.apache.kyuubi.ha.client.DiscoveryClientProvider$.withDiscoveryClient(DiscoveryClientProvider.scala:36)
>       at 
> org.apache.kyuubi.session.KyuubiSessionImpl.$anonfun$openEngineSession$1(KyuubiSessionImpl.scala:134)
>       at 
> org.apache.kyuubi.session.KyuubiSession.handleSessionException(KyuubiSession.scala:49)
>       at 
> org.apache.kyuubi.session.KyuubiSessionImpl.openEngineSession(KyuubiSessionImpl.scala:134)
>       at 
> org.apache.kyuubi.operation.LaunchEngine.$anonfun$runInternal$1(LaunchEngine.scala:60)
>       at 
> java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539)
>       at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
>       at 
> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
>       at 
> java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
>       at java.base/java.lang.Thread.run(Thread.java:840)
> Caused by: java.net.SocketTimeoutException: Connect timed out
>       at 
> java.base/sun.nio.ch.NioSocketImpl.timedFinishConnect(NioSocketImpl.java:551)
>       at java.base/sun.nio.ch.NioSocketImpl.connect(NioSocketImpl.java:602)
>       at java.base/java.net.SocksSocketImpl.connect(SocksSocketImpl.java:327)
>       at java.base/java.net.Socket.connect(Socket.java:633)
>       at 
> org.apache.kyuubi.shaded.thrift.transport.TSocket.open(TSocket.java:250)
> ...
> Error: org.apache.kyuubi.KyuubiSQLException: Error operating LaunchEngine: 
> org.apache.kyuubi.shaded.thrift.transport.TTransportException:
> java.net.SocketTimeoutException: Connect timed out
>       at 
> org.apache.kyuubi.shaded.thrift.transport.TSocket.open(TSocket.java:255)
>       at 
> org.apache.kyuubi.shaded.thrift.transport.TSaslTransport.open(TSaslTransport.java:233)
>       at 
> org.apache.kyuubi.shaded.thrift.transport.TSaslClientTransport.open(TSaslClientTransport.java:39)
>       at 
> org.apache.kyuubi.client.KyuubiSyncThriftClient$.createTProtocol(KyuubiSyncThriftClient.scala:478)
>       at 
> org.apache.kyuubi.client.KyuubiSyncThriftClient$.createClient(KyuubiSyncThriftClient.scala:495)
>       at 
> org.apache.kyuubi.session.KyuubiSessionImpl.$anonfun$openEngineSession$2(KyuubiSessionImpl.scala:177)
>       at 
> org.apache.kyuubi.session.KyuubiSessionImpl.$anonfun$openEngineSession$2$adapted(KyuubiSessionImpl.scala:134)
>       at 
> org.apache.kyuubi.ha.client.DiscoveryClientProvider$.withDiscoveryClient(DiscoveryClientProvider.scala:36)
>       at 
> org.apache.kyuubi.session.KyuubiSessionImpl.$anonfun$openEngineSession$1(KyuubiSessionImpl.scala:134)
>       at 
> org.apache.kyuubi.session.KyuubiSession.handleSessionException(KyuubiSession.scala:49)
>       at 
> org.apache.kyuubi.session.KyuubiSessionImpl.openEngineSession(KyuubiSessionImpl.scala:134)
>       at 
> org.apache.kyuubi.operation.LaunchEngine.$anonfun$runInternal$1(LaunchEngine.scala:60)
>       at 
> java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539)
>       at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
>       at 
> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
>       at 
> java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
>       at java.base/java.lang.Thread.run(Thread.java:840)
> Caused by: java.net.SocketTimeoutException: Connect timed out
>       at 
> java.base/sun.nio.ch.NioSocketImpl.timedFinishConnect(NioSocketImpl.java:551)
>       at java.base/sun.nio.ch.NioSocketImpl.connect(NioSocketImpl.java:602)
>       at java.base/java.net.SocksSocketImpl.connect(SocksSocketImpl.java:327)
>       at java.base/java.net.Socket.connect(Socket.java:633)
>       at 
> org.apache.kyuubi.shaded.thrift.transport.TSocket.open(TSocket.java:250)
>       ... 16 more
> 
>       at 
> org.apache.kyuubi.KyuubiSQLException$.apply(KyuubiSQLException.scala:69)
>       at 
> org.apache.kyuubi.operation.KyuubiOperation$$anonfun$onError$1.$anonfun$applyOrElse$1(KyuubiOperation.scala:94)
>       at 
> scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
>       at org.apache.kyuubi.Utils$.withLockRequired(Utils.scala:392)
>       at 
> org.apache.kyuubi.operation.AbstractOperation.withLockRequired(AbstractOperation.scala:52)
>       at 
> org.apache.kyuubi.operation.KyuubiOperation$$anonfun$onError$1.applyOrElse(KyuubiOperation.scala:78)
>       at 
> org.apache.kyuubi.operation.KyuubiOperation$$anonfun$onError$1.applyOrElse(KyuubiOperation.scala:75)
>       at 
> scala.runtime.AbstractPartialFunction.apply(AbstractPartialFunction.scala:38)
>       at 
> org.apache.kyuubi.operation.LaunchEngine.$anonfun$runInternal$1(LaunchEngine.scala:62)
>       at 
> java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539)
>       at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
>       at 
> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
>       at 
> java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
>       at java.base/java.lang.Thread.run(Thread.java:840)
> Caused by: org.apache.kyuubi.shaded.thrift.transport.TTransportException: 
> java.net.SocketTimeoutException: Connect timed out
>       at 
> org.apache.kyuubi.shaded.thrift.transport.TSocket.open(TSocket.java:255)
>       at 
> org.apache.kyuubi.shaded.thrift.transport.TSaslTransport.open(TSaslTransport.java:233)
>       at 
> org.apache.kyuubi.shaded.thrift.transport.TSaslClientTransport.open(TSaslClientTransport.java:39)
>       at 
> org.apache.kyuubi.client.KyuubiSyncThriftClient$.createTProtocol(KyuubiSyncThriftClient.scala:478)
>       at 
> org.apache.kyuubi.client.KyuubiSyncThriftClient$.createClient(KyuubiSyncThriftClient.scala:495)
>       at 
> org.apache.kyuubi.session.KyuubiSessionImpl.$anonfun$openEngineSession$2(KyuubiSessionImpl.scala:177)
>       at 
> org.apache.kyuubi.session.KyuubiSessionImpl.$anonfun$openEngineSession$2$adapted(KyuubiSessionImpl.scala:134)
>       at 
> org.apache.kyuubi.ha.client.DiscoveryClientProvider$.withDiscoveryClient(DiscoveryClientProvider.scala:36)
>       at 
> org.apache.kyuubi.session.KyuubiSessionImpl.$anonfun$openEngineSession$1(KyuubiSessionImpl.scala:134)
>       at 
> org.apache.kyuubi.session.KyuubiSession.handleSessionException(KyuubiSession.scala:49)
>       at 
> org.apache.kyuubi.session.KyuubiSessionImpl.openEngineSession(KyuubiSessionImpl.scala:134)
>       at 
> org.apache.kyuubi.operation.LaunchEngine.$anonfun$runInternal$1(LaunchEngine.scala:60)
>       ... 5 more
> Caused by: java.net.SocketTimeoutException: Connect timed out
>       at 
> java.base/sun.nio.ch.NioSocketImpl.timedFinishConnect(NioSocketImpl.java:551)
>       at java.base/sun.nio.ch.NioSocketImpl.connect(NioSocketImpl.java:602)
>       at java.base/java.net.SocksSocketImpl.connect(SocksSocketImpl.java:327)
>       at java.base/java.net.Socket.connect(Socket.java:633)
>       at 
> org.apache.kyuubi.shaded.thrift.transport.TSocket.open(TSocket.java:250)
>       ... 16 more (state=,code=0)
> 
> ----------
> 
> The engine continues running and subsequent beeline calls return the same 
> timeout error. Java 17, Scala 2.12, Kyuubi 1.10, Spark 3.5.3,
> running on AARCH64. Any thoughts or suggestions?
> 
> Thanks,
> Aaron
