zxl-333 commented on issue #6890: URL: https://github.com/apache/kyuubi/issues/6890#issuecomment-2594802403
> Did you repeatedly configure spark_catalog_ky catalog? You should remove `spark.sql.catalog.spark_catalog_ky=org.apache.kyuubi.spark.connector.hive.HiveTableCatalog`

When I remove `spark.sql.catalog.spark_catalog_ky=org.apache.kyuubi.spark.connector.hive.HiveTableCatalog`, connecting to the metastore fails with an exception.

**first method -----------------------**

**spark-defaults.conf**

#---------------begin--------------
spark.kerberos.access.hadoopFileSystems hdfs://myns,hdfs://mynsbackup
spark.sql.catalog.hive_catalog org.apache.kyuubi.spark.connector.hive.HiveTableCatalog
spark.sql.catalog.hive_catalog.hive.metastore.uris thrift://bigdata-1734405115-lhalh:9083,thrift://bigdata-1734405115-0xt70:9083
#spark.sql.catalog.hive_catalog.hive.metastore.token.signature=thrift://bigdata-1734405115-lhalh:9083,thrift://bigdata-1734405115-0xt70:9083

#local cluster metastore
spark.sql.catalog.spark_catalog=org.apache.iceberg.spark.SparkSessionCatalog
#spark.sql.catalog.spark_catalog=org.apache.iceberg.spark.SparkCatalog
spark.sql.catalog.spark_catalog.type=hive
spark.sql.catalog.spark_catalog.uri=thrift://bigdata-1734358521-u7gjy:9083,thrift://bigdata-1734358521-gsy9x:9083
#The settings below are a workaround for the current iceberg uri not being equal to HiveConf.ConfVars.METASTOREURIS
#spark.sql.catalog.spark_catalog org.apache.kyuubi.spark.connector.hive.HiveTableCatalog
#spark.sql.catalog.spark_catalog.hive.metastore.uris thrift://bigdata-1734358521-u7gjy:9083,thrift://bigdata-1734358521-gsy9x:9083

#another cluster metastore
**spark.sql.catalog.spark_catalog_ky=org.apache.iceberg.spark.SparkCatalog**
spark.sql.catalog.spark_catalog_ky.type=hive
spark.sql.catalog.spark_catalog_ky.uri=thrift://bigdata-1734405115-lhalh:9083
**#spark.sql.catalog.spark_catalog_ky=org.apache.kyuubi.spark.connector.hive.HiveTableCatalog**
#spark.sql.catalog.spark_catalog_ky.hive.metastore.uris=thrift://bigdata-1734405115-lhalh:9083,thrift://bigdata-1734405115-0xt70:9083
#spark.sql.catalog.spark_catalog_ky.hive.metastore.kerberos.principal=hive/_h...@mr.0c20903122394b4293a44ead5cd1a27e.yun.cn
#spark.sql.catalog.spark_catalog_ky.hive.metastore.sasl.enabled=true
#spark.sql.catalog.spark_catalog_ky.hive.metastore.token.signature=thrift://bigdata-1734405115-lhalh:9083,thrift://bigdata-1734405115-0xt70:9083
#--------------end----------------

**exception:**

25/01/16 15:57:42 INFO metastore: Trying to connect to metastore with URI thrift://bigdata-1734405115-lhalh:9083
25/01/16 15:57:42 WARN metastore: Failed to connect to the MetaStore Server...
25/01/16 15:57:42 INFO metastore: Waiting 3 seconds before next connection attempt.
2025-01-16 15:57:45.449 INFO KyuubiSessionManager-exec-pool: Thread-93 org.apache.kyuubi.operation.ExecuteStatement: Query[f7e6ec20-2ec4-47dc-be3a-1989a3b46bfd] in RUNNING_STATE
25/01/16 15:57:45 INFO DAGScheduler: Asked to cancel job group f7e6ec20-2ec4-47dc-be3a-1989a3b46bfd
25/01/16 15:57:45 ERROR ExecuteStatement: Error operating ExecuteStatement: org.apache.iceberg.hive.RuntimeMetaException: Failed to connect to Hive Metastore
    at org.apache.iceberg.hive.HiveClientPool.newClient(HiveClientPool.java:84)
    at org.apache.iceberg.hive.HiveClientPool.newClient(HiveClientPool.java:34)
    at org.apache.iceberg.ClientPoolImpl.get(ClientPoolImpl.java:125)
    at org.apache.iceberg.ClientPoolImpl.run(ClientPoolImpl.java:56)
    at org.apache.iceberg.ClientPoolImpl.run(ClientPoolImpl.java:51)
    at org.apache.iceberg.hive.CachedClientPool.run(CachedClientPool.java:122)
    at org.apache.iceberg.hive.HiveTableOperations.doRefresh(HiveTableOperations.java:158)
    at org.apache.iceberg.BaseMetastoreTableOperations.refresh(BaseMetastoreTableOperations.java:97)
    at org.apache.iceberg.BaseMetastoreTableOperations.current(BaseMetastoreTableOperations.java:80)
    at org.apache.iceberg.BaseMetastoreCatalog.loadTable(BaseMetastoreCatalog.java:47)
    at org.apache.iceberg.shaded.com.github.benmanes.caffeine.cache.BoundedLocalCache.lambda$doComputeIfAbsent$14(BoundedLocalCache.java:2406)
    at java.util.concurrent.ConcurrentHashMap.compute(ConcurrentHashMap.java:1853)
    at org.apache.iceberg.shaded.com.github.benmanes.caffeine.cache.BoundedLocalCache.doComputeIfAbsent(BoundedLocalCache.java:2404)
    at org.apache.iceberg.shaded.com.github.benmanes.caffeine.cache.BoundedLocalCache.computeIfAbsent(BoundedLocalCache.java:2387)
    at org.apache.iceberg.shaded.com.github.benmanes.caffeine.cache.LocalCache.computeIfAbsent(LocalCache.java:108)
    at org.apache.iceberg.shaded.com.github.benmanes.caffeine.cache.LocalManualCache.get(LocalManualCache.java:62)
    at org.apache.iceberg.CachingCatalog.loadTable(CachingCatalog.java:166)
    at org.apache.iceberg.spark.SparkCatalog.load(SparkCatalog.java:642)
    at org.apache.iceberg.spark.SparkCatalog.loadTable(SparkCatalog.java:160)
    at org.apache.spark.sql.connector.catalog.CatalogV2Util$.loadTable(CatalogV2Util.scala:311)
    at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$.$anonfun$lookupRelation$3(Analyzer.scala:1197)
    at scala.Option.orElse(Option.scala:447)
    at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$.$anonfun$lookupRelation$1(Analyzer.scala:1196)
    at scala.Option.orElse(Option.scala:447)
    at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$.org$apache$spark$sql$catalyst$analysis$Analyzer$ResolveRelations$$lookupRelation(Analyzer.scala:1188)
    at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$$anonfun$apply$13.applyOrElse(Analyzer.scala:1059)
    at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$$anonfun$apply$13.applyOrElse(Analyzer.scala:1023)
    at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.$anonfun$resolveOperatorsUpWithPruning$3(AnalysisHelper.scala:138)
    at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:176)
    at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.$anonfun$resolveOperatorsUpWithPruning$1(AnalysisHelper.scala:138)
    at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper$.allowInvokingTransformsInAnalyzer(AnalysisHelper.scala:323)
    at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.resolveOperatorsUpWithPruning(AnalysisHelper.scala:134)
    at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.resolveOperatorsUpWithPruning$(AnalysisHelper.scala:130)
    at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.resolveOperatorsUpWithPruning(LogicalPlan.scala:30)
    at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.$anonfun$resolveOperatorsUpWithPruning$2(AnalysisHelper.scala:135)
    at org.apache.spark.sql.catalyst.trees.UnaryLike.mapChildren(TreeNode.scala:1228)
    at org.apache.spark.sql.catalyst.trees.UnaryLike.mapChildren$(TreeNode.scala:1227)
    at org.apache.spark.sql.catalyst.plans.logical.Aggregate.mapChildren(basicLogicalOperators.scala:977)
    at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.$anonfun$resolveOperatorsUpWithPruning$1(AnalysisHelper.scala:135)
    at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper$.allowInvokingTransformsInAnalyzer(AnalysisHelper.scala:323)
    at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.resolveOperatorsUpWithPruning(AnalysisHelper.scala:134)
    at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.resolveOperatorsUpWithPruning$(AnalysisHelper.scala:130)
    at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.resolveOperatorsUpWithPruning(LogicalPlan.scala:30)
    at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$.apply(Analyzer.scala:1023)
    at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$.apply(Analyzer.scala:982)
    at org.apache.spark.sql.catalyst.rules.RuleExecutor.$anonfun$execute$2(RuleExecutor.scala:211)
    at scala.collection.LinearSeqOptimized.foldLeft(LinearSeqOptimized.scala:126)
    at scala.collection.LinearSeqOptimized.foldLeft$(LinearSeqOptimized.scala:122)
    at scala.collection.immutable.List.foldLeft(List.scala:91)
    at org.apache.spark.sql.catalyst.rules.RuleExecutor.$anonfun$execute$1(RuleExecutor.scala:208)
    at org.apache.spark.sql.catalyst.rules.RuleExecutor.$anonfun$execute$1$adapted(RuleExecutor.scala:200)
    at scala.collection.immutable.List.foreach(List.scala:431)
    at org.apache.spark.sql.catalyst.rules.RuleExecutor.execute(RuleExecutor.scala:200)
    at org.apache.spark.sql.catalyst.analysis.Analyzer.org$apache$spark$sql$catalyst$analysis$Analyzer$$executeSameContext(Analyzer.scala:231)
    at org.apache.spark.sql.catalyst.analysis.Analyzer.$anonfun$execute$1(Analyzer.scala:227)
    at org.apache.spark.sql.catalyst.analysis.AnalysisContext$.withNewAnalysisContext(Analyzer.scala:173)
    at org.apache.spark.sql.catalyst.analysis.Analyzer.execute(Analyzer.scala:227)
    at org.apache.spark.sql.catalyst.analysis.Analyzer.execute(Analyzer.scala:188)
    at org.apache.spark.sql.catalyst.rules.RuleExecutor.$anonfun$executeAndTrack$1(RuleExecutor.scala:179)
    at org.apache.spark.sql.catalyst.QueryPlanningTracker$.withTracker(QueryPlanningTracker.scala:88)
    at org.apache.spark.sql.catalyst.rules.RuleExecutor.executeAndTrack(RuleExecutor.scala:179)
    at org.apache.spark.sql.catalyst.analysis.Analyzer.$anonfun$executeAndCheck$1(Analyzer.scala:212)
    at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper$.markInAnalyzer(AnalysisHelper.scala:330)
    at org.apache.spark.sql.catalyst.analysis.Analyzer.executeAndCheck(Analyzer.scala:211)
    at org.apache.spark.sql.execution.QueryExecution.$anonfun$analyzed$1(QueryExecution.scala:76)
    at org.apache.spark.sql.catalyst.QueryPlanningTracker.measurePhase(QueryPlanningTracker.scala:111)
    at org.apache.spark.sql.execution.QueryExecution.$anonfun$executePhase$2(QueryExecution.scala:185)
    at org.apache.spark.sql.execution.QueryExecution$.withInternalError(QueryExecution.scala:510)
    at org.apache.spark.sql.execution.QueryExecution.$anonfun$executePhase$1(QueryExecution.scala:185)
    at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:779)
    at org.apache.spark.sql.execution.QueryExecution.executePhase(QueryExecution.scala:184)
    at org.apache.spark.sql.execution.QueryExecution.analyzed$lzycompute(QueryExecution.scala:76)
    at org.apache.spark.sql.execution.QueryExecution.analyzed(QueryExecution.scala:74)
    at org.apache.spark.sql.execution.QueryExecution.assertAnalyzed(QueryExecution.scala:66)
    at org.apache.spark.sql.Dataset$.$anonfun$ofRows$2(Dataset.scala:98)
    at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:779)
    at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:96)
    at org.apache.spark.sql.SparkSession.$anonfun$sql$1(SparkSession.scala:622)
    at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:779)
    at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:617)
    at org.apache.kyuubi.engine.spark.operation.ExecuteStatement.$anonfun$executeStatement$1(ExecuteStatement.scala:86)
    at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
    at org.apache.kyuubi.engine.spark.operation.SparkOperation.$anonfun$withLocalProperties$1(SparkOperation.scala:147)
    at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:169)
    at org.apache.kyuubi.engine.spark.operation.SparkOperation.withLocalProperties(SparkOperation.scala:131)
    at org.apache.kyuubi.engine.spark.operation.ExecuteStatement.executeStatement(ExecuteStatement.scala:81)
    at org.apache.kyuubi.engine.spark.operation.ExecuteStatement$$anon$1.run(ExecuteStatement.scala:103)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:750)
Caused by: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.metastore.HiveMetaStoreClient
    at org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1742)
    at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.<init>(RetryingMetaStoreClient.java:83)
    at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:133)
    at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:104)
    at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:97)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.iceberg.common.DynMethods$UnboundMethod.invokeChecked(DynMethods.java:60)
    at org.apache.iceberg.common.DynMethods$UnboundMethod.invoke(DynMethods.java:72)
    at org.apache.iceberg.common.DynMethods$StaticMethod.invoke(DynMethods.java:185)
    at org.apache.iceberg.hive.HiveClientPool.newClient(HiveClientPool.java:63)
    ... 91 more
Caused by: java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
    at org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1740)
    ... 103 more

**second method --------------**

When both of my catalogs are configured, the connection to the metastore works normally.
However, when reading tables, the Kyuubi Spark Hive connector scan (HiveScan) is always used, and the Iceberg scan is never used to query the data.

#---------------begin--------------
spark.kerberos.access.hadoopFileSystems hdfs://myns,hdfs://mynsbackup
spark.sql.catalog.hive_catalog org.apache.kyuubi.spark.connector.hive.HiveTableCatalog
spark.sql.catalog.hive_catalog.hive.metastore.uris thrift://bigdata-1734405115-lhalh:9083,thrift://bigdata-1734405115-0xt70:9083
#spark.sql.catalog.hive_catalog.hive.metastore.token.signature=thrift://bigdata-1734405115-lhalh:9083,thrift://bigdata-1734405115-0xt70:9083

#local cluster metastore
spark.sql.catalog.spark_catalog=org.apache.iceberg.spark.SparkSessionCatalog
#spark.sql.catalog.spark_catalog=org.apache.iceberg.spark.SparkCatalog
spark.sql.catalog.spark_catalog.type=hive
spark.sql.catalog.spark_catalog.uri=thrift://bigdata-1734358521-u7gjy:9083,thrift://bigdata-1734358521-gsy9x:9083
#The settings below are a workaround for the current iceberg uri not being equal to HiveConf.ConfVars.METASTOREURIS
#spark.sql.catalog.spark_catalog org.apache.kyuubi.spark.connector.hive.HiveTableCatalog
#spark.sql.catalog.spark_catalog.hive.metastore.uris thrift://bigdata-1734358521-u7gjy:9083,thrift://bigdata-1734358521-gsy9x:9083

#another cluster metastore
**spark.sql.catalog.spark_catalog_ky=org.apache.iceberg.spark.SparkCatalog**
spark.sql.catalog.spark_catalog_ky.type=hive
spark.sql.catalog.spark_catalog_ky.uri=thrift://bigdata-1734405115-lhalh:9083
**spark.sql.catalog.spark_catalog_ky=org.apache.kyuubi.spark.connector.hive.HiveTableCatalog**
#spark.sql.catalog.spark_catalog_ky.hive.metastore.uris=thrift://bigdata-1734405115-lhalh:9083,thrift://bigdata-1734405115-0xt70:9083
#spark.sql.catalog.spark_catalog_ky.hive.metastore.kerberos.principal=hive/_h...@mr.0c20903122394b4293a44ead5cd1a27e.yun.cn
#spark.sql.catalog.spark_catalog_ky.hive.metastore.sasl.enabled=true
#spark.sql.catalog.spark_catalog_ky.hive.metastore.token.signature=thrift://bigdata-1734405115-lhalh:9083,thrift://bigdata-1734405115-0xt70:9083
#--------------end----------------

The SQL execution plan is as follows:

== Physical Plan ==
AdaptiveSparkPlan isFinalPlan=false
+- HashAggregate(keys=[k#32], functions=[count(1)])
   +- Exchange hashpartitioning(k#32, 800), ENSURE_REQUIREMENTS, [plan_id=65]
      +- HashAggregate(keys=[k#32], functions=[partial_count(1)])
         +- Project [k#32]
            +- BatchScan[k#32] HiveScan DataFilters: [], Format: hive, Location: HiveCatalogFileIndex(1 paths)[hdfs://mynsbackup/warehouse/tablespace/managed/hive/test_iceberg..., PartitionFilters: [], ReadSchema: struct<k:string> RuntimeFilters: []
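
One possible reading of the second method, stated as an assumption rather than a confirmed diagnosis: if spark-defaults.conf is loaded the way a java.util.Properties file is, a key that appears twice keeps only its last value. In that case `spark.sql.catalog.spark_catalog_ky` would resolve to `HiveTableCatalog`, which would be consistent with the `HiveScan` node in the plan. The Python sketch below lists keys that are defined more than once in such a file. The names `find_duplicate_keys` and `SAMPLE` are hypothetical, the sample omits the `**` emphasis markers, and the parser is simplified (no escapes or line continuations).

```python
import re
from collections import defaultdict

# Simplified Properties-style line: key, then an optional '=' or ':',
# then the value. Escapes and line continuations are not handled.
_LINE = re.compile(r"^([^\s=:]+)[ \t]*[=:]?[ \t]*(.*)$")


def find_duplicate_keys(conf_text):
    """Return {key: [values in file order]} for keys defined more than once."""
    seen = defaultdict(list)
    for raw in conf_text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", "!")):
            continue  # skip blank lines and comments
        match = _LINE.match(line)
        if match:
            key, value = match.groups()
            seen[key].append(value.strip())
    return {k: v for k, v in seen.items() if len(v) > 1}


# Active (uncommented) spark_catalog_ky lines from the second method.
SAMPLE = """\
spark.sql.catalog.spark_catalog_ky=org.apache.iceberg.spark.SparkCatalog
spark.sql.catalog.spark_catalog_ky.type=hive
spark.sql.catalog.spark_catalog_ky.uri=thrift://bigdata-1734405115-lhalh:9083
spark.sql.catalog.spark_catalog_ky=org.apache.kyuubi.spark.connector.hive.HiveTableCatalog
"""

if __name__ == "__main__":
    for key, values in find_duplicate_keys(SAMPLE).items():
        # Assumption: the last definition is the one that takes effect.
        print(f"{key}: defined {len(values)} times, last value = {values[-1]}")
```

Running a check like this against the real file would only show whether a key is duplicated; which definition Spark actually uses can be confirmed separately, for example with `SET spark.sql.catalog.spark_catalog_ky;` in the session.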