findinpath opened a new issue #3951: URL: https://github.com/apache/iceberg/issues/3951
As a newbie in the Apache Iceberg universe, I am eager to try out the functionality exposed by the framework. Setting up an Iceberg environment on Spark is not quite straightforward.

After downloading the Spark 3.1.2 distribution, I configured `spark-defaults.conf`:

```
spark.jars.packages                    org.apache.iceberg:iceberg-spark3-runtime:0.12.1
spark.sql.extensions                   org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions
spark.sql.catalog.demo                 org.apache.iceberg.spark.SparkCatalog
spark.sql.catalog.demo.catalog-impl    org.apache.iceberg.jdbc.JdbcCatalog
spark.sql.catalog.demo.uri             jdbc:postgresql://postgres:5432/demo_catalog
spark.sql.catalog.demo.jdbc.user       admin
spark.sql.catalog.demo.jdbc.password   password
spark.sql.catalog.demo.io-impl         org.apache.iceberg.hadoop.HadoopFileIO
spark.sql.catalog.demo.warehouse       /home/iceberg/warehouse
spark.sql.defaultCatalog               demo
```

Afterwards I set up Postgres to run in a Docker container:

```
docker run --name iceberg-spark-postgres \
  -e POSTGRES_USER=admin \
  -e POSTGRES_PASSWORD=password \
  -e POSTGRES_DB=demo_catalog \
  -p 5432:5432 \
  -d postgres
```

While trying out the scenarios on the page https://iceberg.apache.org/#maintenance/, I noticed that the code snippets start from:

```
Table table = ...
```

Getting the Iceberg table from a Spark catalog is not that straightforward. After digging through the Iceberg source code, I stitched together this snippet for obtaining the table:

```
import org.apache.spark.sql.connector.catalog.Identifier

val sparkCatalog = spark.sessionState.catalogManager.currentCatalog.asInstanceOf[org.apache.iceberg.spark.SparkCatalog]
val sparkTableTest1 = sparkCatalog.loadTable(Identifier.of(Array[String](""), "test1"))
val icebergTableTest1 = sparkTableTest1.table
```

What I'd like to have (as a newbie) is a Docker image or Docker Compose setup for getting started with Iceberg on Spark. Having everything packed together and ready to use makes it much easier for a newbie to get started.
For the code samples, I'd also very much appreciate Java/Scala/Python examples that use `SparkCatalog`, covering a series of general usage scenarios that are not covered by the Iceberg SQL commands.
