Hi Phillip,

Thanks for starting this thread! I want to clarify a few points regarding my comment:

1. GoogleAuthManager.java#L76-L77
   <https://github.com/apache/iceberg/blob/1.10.x/gcp/src/main/java/org/apache/iceberg/gcp/auth/GoogleAuthManager.java#L76-L77>:
   GoogleAuthManager currently relies on only two properties:
   gcp.auth.credentials-path and gcp.auth.scopes. The Google project ID is
   not needed by GoogleAuthManager.

2. We should clearly separate authentication parameters from other
   connection configuration. IcebergRestConnectionConfigInfo is designed to
   connect to any remote catalog that complies with the Iceberg REST spec,
   while GcpAuthenticationParametersDpo should include only Google
   authentication-related properties. It should be generic enough to work
   with any catalog that uses Google Auth.

3. Google Auth doesn't need the project ID; it is GCP BigLake that requires
   this property:
   https://docs.cloud.google.com/biglake/docs/blms-rest-catalog
   - If we look at the Polaris GCP storage config, we don't need to provide
     a project ID there either. It only requires the GCS service account
     (which is itself not a good design, since this property should be
     provided by the catalog service rather than by end users):
     https://github.com/apache/polaris/blob/release/1.3.x/spec/polaris-management-service.yml#L1162-L1171
   - If customers host their catalog server on GCP and use Google Auth for
     authentication, they do not need to specify a project ID when creating
     the Polaris external catalog entity.
   - For example, if they use GCP Proxy Service (API Gateway) to expose REST
     APIs and use Google Auth, no project ID is required.
   - GCP BigLake requires both header.x-goog-user-project and warehouse
     (catalog name). These are used to identify the catalog on the server
     side because BigLake supports multiplexing (it's a multi-tenant catalog
     service where each tenant can have multiple catalogs). Neither of these
     properties is related to authentication.
   - From a design perspective, this is not ideal. For comparison, Glue uses
     a single property to achieve the same purpose, e.g.:
     - warehouse=<aws_account_id>:s3tables/<catalog_name>
     - Here, aws_account_id is equivalent to the Google project ID, and
       s3tables/<catalog_name> is equivalent to BigLake's warehouse.

4. Another part that has not been mentioned yet is service identity; we can
   discuss this later.

Thanks.
Rulin

On Mon, Mar 2, 2026 at 6:40 AM Phillip Henry <[email protected]> wrote:

> Regarding the points Rulin Xing raises on ticket 3729
> <https://github.com/apache/polaris/pull/3729#discussion_r2866294655>, I
> wanted to get some feedback from the community on the following.
>
>
>    1. I'd argue that project id property should not live on
>    IcebergRestConnectionConfigInfo as it's GCP specific AFAIK
>    2. If it were to live in IcebergRestConnectionConfigInfo as a map of
>    properties in (I think that's what you're suggesting RX - correct me if
>    I'm wrong) then its presence could not be enforced for GCP calls.
>    3. GcpAuthenticationParametersDpo "doesn't contain any
>    authentication-related info" but it does contain only what is necessary
>    to trigger AuthManagers to use GCP and also to provide the means to do
>    that via its asIcebergCatalogProperties and
>    asAuthenticationParametersModel methods. I'd argue that this way of
>    doing it leverages the machinery that's already in place and minimally
>    impacts the rest of the codebase.
>
>
> So, I'd propose removing the warehouse field as RX has helpfully pointed
> out but I think GcpAuthenticationParametersDpo should stay even if its DTO
> counterpart is revised.
>
> Thoughts?
>
