Thanks for the reply! This was indeed an issue with external traffic reaching our K8s cluster. I switched to deploying the Kyuubi server with the Helm chart instead of running it outside K8s, and everything works great! Thanks a lot to everyone for a great and very useful product.
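For anyone who finds this thread later: one quick way to confirm this kind of problem is to check, from the machine running the Kyuubi server, whether the engine's Thrift address accepts a TCP connection at all. Below is a minimal Python sketch under that assumption. `can_connect` is a hypothetical helper name, and the host and port are simply the address the engine registered in ZooKeeper in the driver log quoted below; substitute your own.

```python
import socket


def can_connect(host: str, port: int, timeout_s: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port opens within timeout_s."""
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        # Covers timeouts, refused connections, and unroutable addresses.
        return False


if __name__ == "__main__":
    # Assumption: address copied from the "serverUri=" entry in the driver log.
    host, port = "10.3.173.201", 37613
    print(f"{host}:{port} reachable: {can_connect(host, port)}")
```

If this reports the address as unreachable from the Kyuubi server host but reachable from a pod inside the cluster, the pod network is probably not routable from outside, which matches the diagnosis in the quoted reply.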
On Fri, 2024-12-20 at 18:25 +0800, Cheng Pan wrote:
> It sounds like your Kyuubi Server cannot access your K8s Pod IP (where the
> Spark driver lives); if so, this is most likely a network infrastructure
> issue.
>
> Thanks,
> Cheng Pan
>
> On Dec 20, 2024, at 01:34, Aaron Grubb <aa...@kaden.ai> wrote:
> >
> > Hello,
> >
> > I'm trying to set up Kyuubi as a JDBC gateway to Spark on Kubernetes with
> > cluster deploy mode. I'm at the point where the driver is running in
> > Kubernetes and the engine is loaded. Relevant logs from the driver:
> >
> > ------------
> >
> > 24/12/19 16:51:36 INFO SparkContext: Added JAR file:/tmp/spark-f15bf2f2-1f9a-4cdd-b7b3-9bf1bd228454/kyuubi-spark-sql-engine_2.12-1.10.0.jar at spark://spark-e8895193dfd364d7-driver-svc.spark-dev.svc:7078/jars/kyuubi-spark-sql-engine_2.12-1.10.0.jar with timestamp 1734627095298
> > ...
> > 24/12/19 16:53:37 INFO KubernetesClusterSchedulerBackend$KubernetesDriverEndpoint: Registered executor NettyRpcEndpointRef(spark-client://Executor) (10.3.23.80:41846) with ID 1, ResourceProfileId 0
> > 24/12/19 16:53:37 INFO ExecutorMonitor: New executor 1 has registered (new total is 1)
> > 24/12/19 16:53:37 INFO BlockManagerMasterEndpoint: Registering block manager 10.3.23.80:44715 with 5.9 GiB RAM, BlockManagerId(1, 10.3.23.80, 44715, None)
> > 24/12/19 16:53:37 INFO TaskSetManager: Starting task 0.0 in stage 0.0 (TID 0) (10.3.23.80, executor 1, partition 0, PROCESS_LOCAL, 9164 bytes)
> > 24/12/19 16:53:37 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on 10.3.23.80:44715 (size: 3.6 KiB, free: 5.9 GiB)
> > 24/12/19 16:53:38 INFO TaskSetManager: Finished task 0.0 in stage 0.0 (TID 0) in 925 ms on 10.3.23.80 (executor 1) (1/1)
> > 24/12/19 16:53:38 INFO TaskSchedulerImpl: Removed TaskSet 0.0, whose tasks have all completed, from pool
> > 24/12/19 16:53:38 INFO DAGScheduler: ResultStage 0 (isEmpty at KyuubiSparkUtil.scala:51) finished in 106.477 s
> > 24/12/19 16:53:38 INFO DAGScheduler: Job 0 is finished. Cancelling potential speculative or zombie tasks for this job
> > 24/12/19 16:53:38 INFO TaskSchedulerImpl: Killing all running tasks in stage 0: Stage finished
> > 24/12/19 16:53:38 INFO DAGScheduler: Job 0 finished: isEmpty at KyuubiSparkUtil.scala:51, took 106.600060 s
> > 24/12/19 16:53:38 INFO ThreadUtils: SparkSQLSessionManager-exec-pool: pool size: 100, wait queue size: 100, thread keepalive time: 60000 ms
> > 24/12/19 16:53:38 INFO SparkSQLOperationManager: Service[SparkSQLOperationManager] is initialized.
> > 24/12/19 16:53:38 INFO SparkSQLSessionManager: Service[SparkSQLSessionManager] is initialized.
> > 24/12/19 16:53:38 INFO SparkSQLBackendService: Service[SparkSQLBackendService] is initialized.
> > 24/12/19 16:53:38 INFO SparkTBinaryFrontendService: Initializing SparkTBinaryFrontend on kyuubi-user-spark-sql-anonymous-default-0a7bfe5b-3506-433d-8e11:37613 with [9, 999] worker threads
> > ...
> > 24/12/19 16:53:38 INFO ClientCnxn: Opening socket connection to server 10.3.136.134/10.3.136.134:2181. Will not attempt to authenticate using SASL (unknown error)
> > 24/12/19 16:53:38 INFO EngineServiceDiscovery: Service[EngineServiceDiscovery] is initialized.
> > 24/12/19 16:53:38 INFO SparkTBinaryFrontendService: Service[SparkTBinaryFrontend] is initialized.
> > 24/12/19 16:53:38 INFO SparkSQLEngine: Service[SparkSQLEngine] is initialized.
> > 24/12/19 16:53:38 INFO ClientCnxn: Socket connection established to 10.3.136.134/10.3.136.134:2181, initiating session
> > 24/12/19 16:53:38 INFO ClientCnxn: Session establishment complete on server 10.3.136.134/10.3.136.134:2181, sessionid = 0x10000005d9319dc, negotiated timeout = 40000
> > 24/12/19 16:53:38 INFO ConnectionStateManager: State change: CONNECTED
> > 24/12/19 16:53:38 INFO ZookeeperDiscoveryClient: Zookeeper client connection state changed to: CONNECTED
> > 24/12/19 16:53:38 INFO SparkSQLOperationManager: Service[SparkSQLOperationManager] is started.
> > 24/12/19 16:53:38 INFO SparkSQLSessionManager: Service[SparkSQLSessionManager] is started.
> > 24/12/19 16:53:38 INFO SparkSQLBackendService: Service[SparkSQLBackendService] is started.
> > 24/12/19 16:53:39 INFO ZookeeperDiscoveryClient: Created a /kyuubi_1.10.0_USER_SPARK_SQL/anonymous/default/serverUri=10.3.173.201:37613;version=1.10.0;kyuubi.engine.appMgrInfo=eyJyZXNvdXJjZU1hbmFnZXIiOiJrOHM6Ly9odHRwczovLzI1ODY1MzZDM0FBQzNGNEY4NTFCRkNCMUQ4QUQzODNDLmdyNy51cy1lYXN0LTEuZWtzLmFtYXpvbmF3cy5jb20iLCJrdWJlcm5ldGVzSW5mbyI6eyJjb250ZXh0IjpudWxsLCJuYW1lc3BhY2UiOiJzcGFyay1kZXYifX0=;kyuubi.engine.id=spark-af129813d4f5458590fc46f19f754643;kyuubi.engine.url=spark-e8895193dfd364d7-driver-svc.spark-dev.svc:4040;spark.driver.memory=5140m;spark.executor.memory=10350m;refId=0a7bfe5b-3506-433d-8e11-0db01f269f23;sequence=0000000000 on ZooKeeper for KyuubiServer uri: 10.3.173.201:37613
> > 24/12/19 16:53:39 INFO EngineServiceDiscovery: Registered EngineServiceDiscovery in namespace /kyuubi_1.10.0_USER_SPARK_SQL/anonymous/default.
> > 24/12/19 16:53:39 INFO EngineServiceDiscovery: Service[EngineServiceDiscovery] is started.
> > 24/12/19 16:53:39 INFO SparkTBinaryFrontendService: Service[SparkTBinaryFrontend] is started.
> > 24/12/19 16:53:39 INFO SparkSQLEngine: Service[SparkSQLEngine] is started.
> > 24/12/19 16:53:39 INFO BlockManagerInfo: Removed broadcast_0_piece0 on 10.3.23.80:44715 in memory (size: 3.6 KiB, free: 5.9 GiB)
> > 24/12/19 16:53:39 INFO BlockManagerInfo: Removed broadcast_0_piece0 on spark-e8895193dfd364d7-driver-svc.spark-dev.svc:7079 in memory (size: 3.6 KiB, free: 2.7 GiB)
> > 24/12/19 16:53:40 INFO SparkSQLEngine:
> > Spark application name: kyuubi_USER_SPARK_SQL_anonymous_default_0a7bfe5b-3506-433d-8e11-0db01f269f23
> > application ID: spark-af129813d4f5458590fc46f19f754643
> > application tags:
> > application web UI: http://spark-e8895193dfd364d7-driver-svc.spark-dev.svc:4040
> > master: XXX
> > version: 3.5.3
> > driver: [cpu: 1, mem: 5140m]
> > executor: [cpu: 4, mem: 10350m, maxNum: 1]
> > Start time: Thu Dec 19 16:51:35 UTC 2024
> >
> > User: anonymous (shared mode: USER)
> > State: STARTED
> >
> > -----------
> >
> > However, when the engine is called with thrift, the connection times out.
> > These are the relevant logs from kyuubi-beeline:
> >
> > -----------
> >
> > 2024-12-19 17:11:26.998 ERROR KyuubiSessionManager-exec-pool: Thread-93 org.apache.kyuubi.session.KyuubiSessionImpl: Opening engine [kyuubi_USER_SPARK_SQL_anonymous_default_64dceadd-595d-402a-adc2-4e0a6d31fdf6 10.3.173.201:37613] for anonymous session failed
> > org.apache.kyuubi.shaded.thrift.transport.TTransportException: java.net.SocketTimeoutException: Connect timed out
> >     at org.apache.kyuubi.shaded.thrift.transport.TSocket.open(TSocket.java:255)
> >     at org.apache.kyuubi.shaded.thrift.transport.TSaslTransport.open(TSaslTransport.java:233)
> >     at org.apache.kyuubi.shaded.thrift.transport.TSaslClientTransport.open(TSaslClientTransport.java:39)
> >     at org.apache.kyuubi.client.KyuubiSyncThriftClient$.createTProtocol(KyuubiSyncThriftClient.scala:478)
> >     at org.apache.kyuubi.client.KyuubiSyncThriftClient$.createClient(KyuubiSyncThriftClient.scala:495)
> >     at org.apache.kyuubi.session.KyuubiSessionImpl.$anonfun$openEngineSession$2(KyuubiSessionImpl.scala:177)
> >     at org.apache.kyuubi.session.KyuubiSessionImpl.$anonfun$openEngineSession$2$adapted(KyuubiSessionImpl.scala:134)
> >     at org.apache.kyuubi.ha.client.DiscoveryClientProvider$.withDiscoveryClient(DiscoveryClientProvider.scala:36)
> >     at org.apache.kyuubi.session.KyuubiSessionImpl.$anonfun$openEngineSession$1(KyuubiSessionImpl.scala:134)
> >     at org.apache.kyuubi.session.KyuubiSession.handleSessionException(KyuubiSession.scala:49)
> >     at org.apache.kyuubi.session.KyuubiSessionImpl.openEngineSession(KyuubiSessionImpl.scala:134)
> >     at org.apache.kyuubi.operation.LaunchEngine.$anonfun$runInternal$1(LaunchEngine.scala:60)
> >     at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539)
> >     at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
> >     at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
> >     at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
> >     at java.base/java.lang.Thread.run(Thread.java:840)
> > Caused by: java.net.SocketTimeoutException: Connect timed out
> >     at java.base/sun.nio.ch.NioSocketImpl.timedFinishConnect(NioSocketImpl.java:551)
> >     at java.base/sun.nio.ch.NioSocketImpl.connect(NioSocketImpl.java:602)
> >     at java.base/java.net.SocksSocketImpl.connect(SocksSocketImpl.java:327)
> >     at java.base/java.net.Socket.connect(Socket.java:633)
> >     at org.apache.kyuubi.shaded.thrift.transport.TSocket.open(TSocket.java:250)
> >     ...
> > Error: org.apache.kyuubi.KyuubiSQLException: Error operating LaunchEngine: org.apache.kyuubi.shaded.thrift.transport.TTransportException: java.net.SocketTimeoutException: Connect timed out
> >     at org.apache.kyuubi.shaded.thrift.transport.TSocket.open(TSocket.java:255)
> >     at org.apache.kyuubi.shaded.thrift.transport.TSaslTransport.open(TSaslTransport.java:233)
> >     at org.apache.kyuubi.shaded.thrift.transport.TSaslClientTransport.open(TSaslClientTransport.java:39)
> >     at org.apache.kyuubi.client.KyuubiSyncThriftClient$.createTProtocol(KyuubiSyncThriftClient.scala:478)
> >     at org.apache.kyuubi.client.KyuubiSyncThriftClient$.createClient(KyuubiSyncThriftClient.scala:495)
> >     at org.apache.kyuubi.session.KyuubiSessionImpl.$anonfun$openEngineSession$2(KyuubiSessionImpl.scala:177)
> >     at org.apache.kyuubi.session.KyuubiSessionImpl.$anonfun$openEngineSession$2$adapted(KyuubiSessionImpl.scala:134)
> >     at org.apache.kyuubi.ha.client.DiscoveryClientProvider$.withDiscoveryClient(DiscoveryClientProvider.scala:36)
> >     at org.apache.kyuubi.session.KyuubiSessionImpl.$anonfun$openEngineSession$1(KyuubiSessionImpl.scala:134)
> >     at org.apache.kyuubi.session.KyuubiSession.handleSessionException(KyuubiSession.scala:49)
> >     at org.apache.kyuubi.session.KyuubiSessionImpl.openEngineSession(KyuubiSessionImpl.scala:134)
> >     at org.apache.kyuubi.operation.LaunchEngine.$anonfun$runInternal$1(LaunchEngine.scala:60)
> >     at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539)
> >     at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
> >     at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
> >     at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
> >     at java.base/java.lang.Thread.run(Thread.java:840)
> > Caused by: java.net.SocketTimeoutException: Connect timed out
> >     at java.base/sun.nio.ch.NioSocketImpl.timedFinishConnect(NioSocketImpl.java:551)
> >     at java.base/sun.nio.ch.NioSocketImpl.connect(NioSocketImpl.java:602)
> >     at java.base/java.net.SocksSocketImpl.connect(SocksSocketImpl.java:327)
> >     at java.base/java.net.Socket.connect(Socket.java:633)
> >     at org.apache.kyuubi.shaded.thrift.transport.TSocket.open(TSocket.java:250)
> >     ... 16 more
> >
> >     at org.apache.kyuubi.KyuubiSQLException$.apply(KyuubiSQLException.scala:69)
> >     at org.apache.kyuubi.operation.KyuubiOperation$$anonfun$onError$1.$anonfun$applyOrElse$1(KyuubiOperation.scala:94)
> >     at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
> >     at org.apache.kyuubi.Utils$.withLockRequired(Utils.scala:392)
> >     at org.apache.kyuubi.operation.AbstractOperation.withLockRequired(AbstractOperation.scala:52)
> >     at org.apache.kyuubi.operation.KyuubiOperation$$anonfun$onError$1.applyOrElse(KyuubiOperation.scala:78)
> >     at org.apache.kyuubi.operation.KyuubiOperation$$anonfun$onError$1.applyOrElse(KyuubiOperation.scala:75)
> >     at scala.runtime.AbstractPartialFunction.apply(AbstractPartialFunction.scala:38)
> >     at org.apache.kyuubi.operation.LaunchEngine.$anonfun$runInternal$1(LaunchEngine.scala:62)
> >     at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539)
> >     at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
> >     at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
> >     at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
> >     at java.base/java.lang.Thread.run(Thread.java:840)
> > Caused by: org.apache.kyuubi.shaded.thrift.transport.TTransportException: java.net.SocketTimeoutException: Connect timed out
> >     at org.apache.kyuubi.shaded.thrift.transport.TSocket.open(TSocket.java:255)
> >     at org.apache.kyuubi.shaded.thrift.transport.TSaslTransport.open(TSaslTransport.java:233)
> >     at org.apache.kyuubi.shaded.thrift.transport.TSaslClientTransport.open(TSaslClientTransport.java:39)
> >     at org.apache.kyuubi.client.KyuubiSyncThriftClient$.createTProtocol(KyuubiSyncThriftClient.scala:478)
> >     at org.apache.kyuubi.client.KyuubiSyncThriftClient$.createClient(KyuubiSyncThriftClient.scala:495)
> >     at org.apache.kyuubi.session.KyuubiSessionImpl.$anonfun$openEngineSession$2(KyuubiSessionImpl.scala:177)
> >     at org.apache.kyuubi.session.KyuubiSessionImpl.$anonfun$openEngineSession$2$adapted(KyuubiSessionImpl.scala:134)
> >     at org.apache.kyuubi.ha.client.DiscoveryClientProvider$.withDiscoveryClient(DiscoveryClientProvider.scala:36)
> >     at org.apache.kyuubi.session.KyuubiSessionImpl.$anonfun$openEngineSession$1(KyuubiSessionImpl.scala:134)
> >     at org.apache.kyuubi.session.KyuubiSession.handleSessionException(KyuubiSession.scala:49)
> >     at org.apache.kyuubi.session.KyuubiSessionImpl.openEngineSession(KyuubiSessionImpl.scala:134)
> >     at org.apache.kyuubi.operation.LaunchEngine.$anonfun$runInternal$1(LaunchEngine.scala:60)
> >     ... 5 more
> > Caused by: java.net.SocketTimeoutException: Connect timed out
> >     at java.base/sun.nio.ch.NioSocketImpl.timedFinishConnect(NioSocketImpl.java:551)
> >     at java.base/sun.nio.ch.NioSocketImpl.connect(NioSocketImpl.java:602)
> >     at java.base/java.net.SocksSocketImpl.connect(SocksSocketImpl.java:327)
> >     at java.base/java.net.Socket.connect(Socket.java:633)
> >     at org.apache.kyuubi.shaded.thrift.transport.TSocket.open(TSocket.java:250)
> >     ... 16 more (state=,code=0)
> >
> > ----------
> >
> > The engine continues running and subsequent beeline calls return the same
> > timeout error. Java 17, Scala 2.12, Kyuubi 1.10, Spark 3.5.3, running on
> > AARCH64. Any thoughts or suggestions?
> >
> > Thanks,
> > Aaron