miR172 opened a new issue, #7944: URL: https://github.com/apache/iceberg/issues/7944
### Query engine

Spark 3.4.1

### Question

I was using Spark 3.4.1, downloaded [here](https://spark.apache.org/downloads.html). According to [this table](https://iceberg.apache.org/multi-engine-support/#apache-spark), I should be able to use this iceberg-spark-runtime version: `org.apache.iceberg:iceberg-spark-runtime-3.4_2.12:1.3.0`.

This is how I started my local session:

```
./spark-3.4.1-bin-hadoop3/bin/spark-sql --packages org.apache.iceberg:iceberg-spark-runtime-3.4_2.12:1.3.0 --properties-file /home/tslocal/spark-defaults.conf
```

I got the following error:

```
Exception in thread "main" java.lang.IllegalArgumentException: Cannot initialize FileIO, missing no-arg constructor: org.apache.iceberg.gcp.gcs.GCSFileIO
	at org.apache.iceberg.CatalogUtil.loadFileIO(CatalogUtil.java:312)
	at org.apache.iceberg.hive.HiveCatalog.initialize(HiveCatalog.java:111)
	at org.apache.iceberg.CatalogUtil.loadCatalog(CatalogUtil.java:239)
	at org.apache.iceberg.CatalogUtil.buildIcebergCatalog(CatalogUtil.java:284)
	at org.apache.iceberg.spark.SparkCatalog.buildIcebergCatalog(SparkCatalog.java:143)
	at org.apache.iceberg.spark.SparkCatalog.initialize(SparkCatalog.java:551)
	at org.apache.iceberg.spark.SparkSessionCatalog.buildSparkCatalog(SparkSessionCatalog.java:81)
	at org.apache.iceberg.spark.SparkSessionCatalog.initialize(SparkSessionCatalog.java:311)
	at org.apache.spark.sql.connector.catalog.Catalogs$.load(Catalogs.scala:65)
	at org.apache.spark.sql.connector.catalog.CatalogManager.$anonfun$catalog$1(CatalogManager.scala:53)
	at scala.collection.mutable.HashMap.getOrElseUpdate(HashMap.scala:86)
	at org.apache.spark.sql.connector.catalog.CatalogManager.catalog(CatalogManager.scala:53)
	at org.apache.spark.sql.connector.catalog.CatalogManager.currentCatalog(CatalogManager.scala:122)
	at org.apache.spark.sql.connector.catalog.CatalogManager.currentNamespace(CatalogManager.scala:93)
	at org.apache.spark.sql.internal.CatalogImpl.currentDatabase(CatalogImpl.scala:65)
	at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver$.currentDB$1(SparkSQLCLIDriver.scala:285)
	at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver$.promptWithCurrentDB$1(SparkSQLCLIDriver.scala:292)
	at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver$.main(SparkSQLCLIDriver.scala:296)
	at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.main(SparkSQLCLIDriver.scala)
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.base/java.lang.reflect.Method.invoke(Method.java:566)
	at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
	at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:1020)
	at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:192)
	at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:215)
	at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:91)
	at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1111)
	at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1120)
	at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.lang.NoSuchMethodException: Cannot find constructor for interface org.apache.iceberg.io.FileIO
	Missing org.apache.iceberg.gcp.gcs.GCSFileIO [java.lang.ClassNotFoundException: org.apache.iceberg.gcp.gcs.GCSFileIO]
	at org.apache.iceberg.common.DynConstructors.buildCheckedException(DynConstructors.java:250)
	at org.apache.iceberg.common.DynConstructors.access$200(DynConstructors.java:32)
	at org.apache.iceberg.common.DynConstructors$Builder.buildChecked(DynConstructors.java:220)
	at org.apache.iceberg.CatalogUtil.loadFileIO(CatalogUtil.java:309)
	... 30 more
	Suppressed: java.lang.ClassNotFoundException: org.apache.iceberg.gcp.gcs.GCSFileIO
		at java.base/java.net.URLClassLoader.findClass(URLClassLoader.java:476)
		at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:589)
		at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:522)
		at java.base/java.lang.Class.forName0(Native Method)
		at java.base/java.lang.Class.forName(Class.java:398)
		at org.apache.iceberg.common.DynConstructors$Builder.impl(DynConstructors.java:149)
		at org.apache.iceberg.CatalogUtil.loadFileIO(CatalogUtil.java:308)
		... 30 more
```

Here is the content of my Spark conf:

```
spark.sql.extensions                     org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions
spark.sql.catalog.qh_catalog             org.apache.iceberg.spark.SparkSessionCatalog
spark.sql.catalog.qh_catalog.type        hive
spark.sql.catalog.qh_catalog.io-impl     org.apache.iceberg.gcp.gcs.GCSFileIO
spark.sql.catalog.qh_catalog.warehouse   gs://bq-bl-demo/hive-iceberg
# set my catalog above as the default catalog for Spark
spark.sql.defaultCatalog                 qh_catalog
```

`GCSFileIO` clearly has a no-arg constructor. I want to write Iceberg tables into GCS from local Spark. I haven't gotten this working end-to-end yet, and the conf is still missing GCP-relevant properties such as project id, token, etc., but I didn't expect to hit this catalog-loading issue before getting a missing-properties error.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at: [email protected]
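Since the `Suppressed: java.lang.ClassNotFoundException` confirms the class is absent from the classpath, a likely cause is that `GCSFileIO` ships in the separate `iceberg-gcp` module rather than in the `iceberg-spark-runtime` jar, so that module (and the GCS client libraries it depends on) would also need to be resolved. A hedged sketch of the extra dependency, assuming the `iceberg-gcp` artifact at the matching 1.3.0 version (this coordinate pairing is my assumption, not something confirmed above):

```
# hypothetical addition to spark-defaults.conf: resolve the GCP module
# alongside the Spark runtime jar so GCSFileIO is on the classpath
spark.jars.packages  org.apache.iceberg:iceberg-spark-runtime-3.4_2.12:1.3.0,org.apache.iceberg:iceberg-gcp:1.3.0
```

The same coordinates could equally be appended to the `--packages` flag of the `spark-sql` invocation above; `spark.jars.packages` is the configuration-file equivalent of that flag.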
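One note on reading the error: the top-level message ("missing no-arg constructor") is misleading here. As the `Caused by`/`Suppressed` chain in the trace shows, the actual failure is a `java.lang.ClassNotFoundException` for `org.apache.iceberg.gcp.gcs.GCSFileIO`, i.e. the class is not on the classpath at all, not missing a constructor. The innermost chained exception is the one to act on; a small sketch of that reading, using plain Python string handling on the first line of each exception in the chain (nothing Iceberg-specific, the helper name is mine):

```python
# Sketch: walk a Java stack trace from the outside in and report the
# innermost "Caused by:" / "Suppressed:" exception, which is usually
# the real root cause. Input here is the chain from the trace above,
# reduced to its three exception header lines.
trace = """\
Exception in thread "main" java.lang.IllegalArgumentException: Cannot initialize FileIO, missing no-arg constructor: org.apache.iceberg.gcp.gcs.GCSFileIO
Caused by: java.lang.NoSuchMethodException: Cannot find constructor for interface org.apache.iceberg.io.FileIO
Suppressed: java.lang.ClassNotFoundException: org.apache.iceberg.gcp.gcs.GCSFileIO
"""

def root_cause(trace: str) -> str:
    """Return the last (innermost) chained exception line, or the first line."""
    innermost = trace.splitlines()[0]
    for line in trace.splitlines():
        stripped = line.strip()
        for prefix in ("Caused by: ", "Suppressed: "):
            if stripped.startswith(prefix):
                innermost = stripped[len(prefix):]
    return innermost

print(root_cause(trace))
# -> java.lang.ClassNotFoundException: org.apache.iceberg.gcp.gcs.GCSFileIO
```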
