joyhaldar opened a new issue, #14422: URL: https://github.com/apache/iceberg/issues/14422
### Query engine

Spark locally and GCP Dataproc Serverless on the cloud.

### Question

_**Description:**_

I'm confused about the correct configuration properties for **BigQueryMetastoreCatalog** and believe there may be an inconsistency or documentation gap.

_**Context:**_

When using BigQueryMetastoreCatalog on Google Dataproc, the configuration from [Google's official documentation](https://docs.cloud.google.com/biglake/docs/configure-blms) works fine:

```
spark.sql.catalog.my_catalog.gcp_project=PROJECT_ID
spark.sql.catalog.my_catalog.gcp_location=LOCATION
```

However, when I try to use the Apache Iceberg JAR directly (e.g., iceberg:1.10.0, iceberg-bigquery:1.10.0) with Spark running elsewhere (outside Dataproc), this configuration doesn't work.

**_Investigation:_**

Looking at the Iceberg source code (**[BigQueryMetastoreCatalog.java](https://github.com/apache/iceberg/blob/main/bigquery/src/main/java/org/apache/iceberg/gcp/bigquery/BigQueryMetastoreCatalog.java)**), the properties are defined as:

```
public static final String PROJECT_ID = "gcp.bigquery.project-id";
public static final String GCP_LOCATION = "gcp.bigquery.location";
```

This suggests the configuration should be:

```
spark.sql.catalog.my_catalog.gcp.bigquery.project-id=PROJECT_ID
spark.sql.catalog.my_catalog.gcp.bigquery.location=LOCATION
```

**_Questions:_**

1. Why does the Google documentation approach (gcp_project, gcp_location) work on Dataproc but not with the standard Iceberg JAR?
2. Is there a configuration translation or aliasing layer in the Dataproc-provided JAR (gs://spark-lib/bigquery/iceberg-bigquery-catalog-1.6.1-1.0.1-beta.jar) that's not present in the Maven Central release?
3. What are the correct property names users should use when running Spark with Iceberg outside of Dataproc?
4. Should the Iceberg codebase support both property name formats for compatibility, or should Google's documentation be updated?
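
To make the difference between the two naming schemes concrete, here is a minimal Python sketch that expands each scheme into full Spark property names. The helper `catalog_conf` and the style labels `"iceberg"` and `"dataproc"` are hypothetical names used only for illustration. The key strings themselves are copied from the snippets above, and whether each JAR actually reads them is exactly what this issue is asking.

```python
# Hypothetical helper (not part of Iceberg or Dataproc): expands the
# catalog-level keys quoted in this issue into full Spark property names.
def catalog_conf(catalog_name, project_id, location, style="iceberg"):
    prefix = f"spark.sql.catalog.{catalog_name}"
    if style == "iceberg":
        # Keys taken from the constants in BigQueryMetastoreCatalog.java.
        keys = {
            "gcp.bigquery.project-id": project_id,
            "gcp.bigquery.location": location,
        }
    elif style == "dataproc":
        # Keys taken from Google's documentation for Dataproc.
        keys = {
            "gcp_project": project_id,
            "gcp_location": location,
        }
    else:
        raise ValueError(f"unknown style: {style!r}")
    return {f"{prefix}.{key}": value for key, value in keys.items()}


# Print both variants side by side for comparison.
for style in ("iceberg", "dataproc"):
    print(f"# {style}")
    for name, value in catalog_conf("my_catalog", "PROJECT_ID", "LOCATION", style).items():
        print(f"{name}={value}")
```

Each returned dictionary could be passed entry by entry to `--conf` on `spark-submit` or to a Spark session builder, which makes it easy to switch between the two schemes while testing.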

**_Expected Behavior:_**

Documentation on which property names to use, and ideally support for both formats to avoid user confusion.

**_Actual Behavior:_**

Users following Google's documentation may face issues when using Iceberg JARs from Maven Central outside of Dataproc environments.

Also, please correct me if I am wrong and way off.
