KodaiD commented on code in PR #16512:
URL: https://github.com/apache/iceberg/pull/16512#discussion_r3281919700
##########
site/docs/spark-quickstart.md:
##########
@@ -284,44 +288,84 @@ To read a table, simply use the Iceberg table's name.
df = spark.table("demo.nyc.taxis").show()
```
-### Adding A Catalog
+### Configuring Catalogs
-Iceberg has several catalog back-ends that can be used to track tables, like
JDBC, Hive MetaStore and Glue.
-Catalogs are configured using properties under
`spark.sql.catalog.(catalog_name)`. In this guide,
-we use JDBC, but you can follow these instructions to configure other catalog
types. To learn more, check out
-the [Catalog](docs/latest/spark-configuration.md#catalogs) page in the Spark
section.
+Iceberg provides several catalog implementations to manage tables and enable
SQL operations.
+Catalogs are configured using properties under
`spark.sql.catalog.(catalog_name)`.
+You can configure different catalog types, such as JDBC, Hive Metastore, Glue,
and REST, to manage Iceberg tables in Spark.
-This configuration creates a path-based catalog named `local` for tables under
`$PWD/warehouse` and adds support for Iceberg tables to Spark's built-in
catalog.
+This guide covers the configuration of two popular catalog types: REST and
JDBC.
+To learn more, check out the
[Catalog](docs/latest/spark-configuration.md#catalogs) page in the Spark
section.
+
+#### Configuring REST Catalog
+
+The REST catalog provides a language-agnostic way to manage Iceberg tables
through a RESTful service. The following configuration creates a REST-based
catalog named `rest`, using the `apache/iceberg-rest-fixture` container from
the `docker-compose.yml` above as the REST server and MinIO for S3-compatible
storage:
=== "CLI"
```sh
- spark-sql --packages org.apache.iceberg:iceberg-spark-runtime-{{
sparkVersionMajor }}:{{ icebergVersion }}\
- --conf
spark.sql.extensions=org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions
\
- --conf
spark.sql.catalog.spark_catalog=org.apache.iceberg.spark.SparkSessionCatalog \
- --conf spark.sql.catalog.spark_catalog.type=hive \
+ docker exec -it spark-iceberg spark-sql \
+ --conf spark.sql.catalog.rest=org.apache.iceberg.spark.SparkCatalog \
+ --conf spark.sql.catalog.rest.type=rest \
+ --conf spark.sql.catalog.rest.uri=http://rest:8181 \
+ --conf spark.sql.catalog.rest.warehouse=s3://warehouse/ \
+ --conf
spark.sql.catalog.rest.io-impl=org.apache.iceberg.aws.s3.S3FileIO \
+ --conf spark.sql.catalog.rest.s3.endpoint=http://minio:9000 \
+ --conf spark.sql.catalog.rest.s3.path-style-access=true \
+ --conf spark.sql.defaultCatalog=rest
+ ```
+
+=== "spark-defaults.conf"
+
+ ```sh
+ docker exec spark-iceberg bash -c 'cat >>
/opt/spark/conf/spark-defaults.conf <<EOF
+ spark.sql.catalog.rest
org.apache.iceberg.spark.SparkCatalog
+ spark.sql.catalog.rest.type rest
+ spark.sql.catalog.rest.uri http://rest:8181
+ spark.sql.catalog.rest.warehouse s3://warehouse/
+ spark.sql.catalog.rest.io-impl
org.apache.iceberg.aws.s3.S3FileIO
+ spark.sql.catalog.rest.s3.endpoint http://minio:9000
+ spark.sql.catalog.rest.s3.path-style-access true
+ spark.sql.defaultCatalog rest
+ EOF'
+ ```
+
+#### Configuring JDBC Catalog
+
+The following configuration creates a JDBC-based catalog named `local` for
tables under `/home/iceberg/warehouse`, using SQLite as the catalog backend.
+
+First, download the SQLite JDBC driver into the Spark jars directory:
+
+```sh
+docker exec spark-iceberg curl -sL -o /opt/spark/jars/sqlite-jdbc-3.53.1.0.jar \
+  https://repo1.maven.org/maven2/org/xerial/sqlite-jdbc/3.53.1.0/sqlite-jdbc-3.53.1.0.jar
Review Comment:
Without this download step, the SQLite JDBC driver is not on Spark's classpath, so initializing the JDBC catalog throws the following exception:
```
java.sql.SQLException: No suitable driver found for jdbc:sqlite:/home/iceberg/warehouse/iceberg_catalog_db.sqlite
```
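A quick way to confirm the driver is in place before starting Spark could look like the sketch below. `has_sqlite_driver` is a hypothetical helper of mine, not something in this PR, and it assumes the jar keeps the `sqlite-jdbc-<version>.jar` naming used in the download command:

```sh
# Hypothetical helper (not part of the PR): succeeds if at least one
# sqlite-jdbc jar is present in the directory passed as the first argument.
has_sqlite_driver() {
  dir="$1"
  for jar in "$dir"/sqlite-jdbc-*.jar; do
    # If nothing matches, the glob stays literal, so check for a real file.
    [ -f "$jar" ] && return 0
  done
  return 1
}

# Example call, assuming it runs inside the spark-iceberg container:
#   has_sqlite_driver /opt/spark/jars || echo "SQLite JDBC driver is missing"
```

If the check fails, the download step has not been run (or the jar landed in a different directory), and the `No suitable driver found` error is to be expected.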
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]