Thanks, Rulin. Regarding points 1-3: would a step in the right direction be to call the DTO *BigLake*AuthenticationParametersDpo rather than *Gcp*AuthenticationParametersDpo? I'm not trying to cover every possible GCP setup, as I think that is beyond the scope of this ticket. Rather, I just want to connect to my BigLake metastore, which is something the code currently enables.
If this were acceptable to you, I'd then argue that quotaProject is analogous to, say, SigV4AuthenticationParameters.signingRegion insofar as both basically define where the request is being sent (albeit one physically and one logically).

If people disagree, I'm happy to put quotaProject elsewhere but I will need a steer on exactly where it should go. What I'm trying to avoid, however, is the scope of this ticket ballooning.

Please let me know your thoughts.

Regards,

Phillip

On Tue, Mar 3, 2026 at 6:47 PM Rulin Xing <[email protected]> wrote:

> Hi Phillip,
>
> Thanks for starting this thread! I want to clarify a few points regarding
> to my comment:
>
> 1. GoogleAuthManager.java#L76-L77
>    <https://github.com/apache/iceberg/blob/1.10.x/gcp/src/main/java/org/apache/iceberg/gcp/auth/GoogleAuthManager.java#L76-L77>:
>    GoogleAuthManager only relies on these two properties for now:
>    gcp.auth.credentials-path and gcp.auth.scopes. google project id is
>    not needed by GoogleAuthManager
> 2. We should clearly separate authentication parameters from other
>    connection configurations. IcebergRestConnectionConfigInfo is designed
>    to connect to any remote catalog that complies with the Iceberg REST
>    spec, while GcpAuthenticationParametersDpo should include only Google
>    authentication-related properties. It should be generic enough to work
>    with any catalog that uses Google Auth.
> 3. Google Auth doesn't need the project id, it's GCP big lake that
>    requires this property:
>    https://docs.cloud.google.com/biglake/docs/blms-rest-catalog
>    - If we look at Polaris GCP storage config, we also don’t need to
>      provide a project ID, it only requires the gcs service account:
>      (which is not a good design since this property should be provided
>      by catalog service and shouldn't be provided by end users):
>      https://github.com/apache/polaris/blob/release/1.3.x/spec/polaris-management-service.yml#L1162-L1171
>    - If customers host their catalog server on GCP and use Google Auth
>      for authentication, they do not need to specify a project id when
>      creating the Polaris external catalog entity.
>      - For example, if they use GCP Proxy Service (API Gateway) to
>        expose REST APIs and use google auth, no project id is required.
>    - GCP BigLake requires both header.x-goog-user-project and warehouse
>      (catalog name). These are used to identify the catalog on the server
>      side because BigLake supports multiplexing (it’s a multi-tenant
>      catalog service where each tenant can have multiple catalogs).
>      Neither of these properties is related to authentication.
>      - From a design perspective, this is not ideal. For comparison, in
>        Glue, a single property is used to achieve the same purpose,
>        e.g.,:
>        - warehouse=<aws_account_id>:s3tables/<catalog_name>
>        - Here, aws_account_id is equivalent to the Google project ID,
>        - and s3tables/<catalog_name> is equivalent to BigLake’s
>          warehouse.
> 4. Another part that is not mentioned is the service identity part, we
>    can discuss this later
>
> Thanks.
> Rulin
>
> On Mon, Mar 2, 2026 at 6:40 AM Phillip Henry <[email protected]> wrote:
>
> > Regarding the points Rulin Xing raises on ticket 3729
> > <https://github.com/apache/polaris/pull/3729#discussion_r2866294655>, I
> > wanted to get some feedback from the community on the following.
> >
> > 1. I'd argue that project id property should not live on
> >    IcebergRestConnectionConfigInfo as it's GCP specific AFAIK
> > 2. If it were to live in IcebergRestConnectionConfigInfo as a map of
> >    properties in (I think that's what you're suggesting RX - correct me
> >    if I'm wrong) then its presence could not be enforced for GCP calls.
> > 3. GcpAuthenticationParametersDpo "doesn't contain any
> >    authentication-related info" but it does contain only what is
> >    necessary to trigger AuthManagers to use GCP and also to provide the
> >    means to do that via its asIcebergCatalogProperties and
> >    asAuthenticationParametersModel methods. I'd argue that this way of
> >    doing it leverages the machinery that's already in place and
> >    minimally impacts the rest of the codebase.
> >
> > So, I'd propose removing the warehouse field as RX has helpfully pointed
> > out but I think GcpAuthenticationParametersDpo should stay even if its
> > DTO counterpart is revised.
> >
> > Thoughts?
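
For reference, the split discussed in this thread (Google auth properties on one side, BigLake catalog-identifying properties on the other) might be sketched roughly as below. This is only an illustration, not Polaris or Iceberg code: the class and method names are hypothetical, as are the example values. The only names taken from the thread are the property keys gcp.auth.credentials-path, gcp.auth.scopes, header.x-goog-user-project and warehouse.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: none of these names exist in Polaris or Iceberg.
public class CatalogPropertiesSketch {

    // Properties the thread says GoogleAuthManager relies on (authentication only).
    static Map<String, String> googleAuthProperties(String credentialsPath, String scopes) {
        Map<String, String> props = new LinkedHashMap<>();
        props.put("gcp.auth.credentials-path", credentialsPath);
        props.put("gcp.auth.scopes", scopes);
        return props;
    }

    // Properties the thread says BigLake needs to identify the catalog (not authentication).
    static Map<String, String> bigLakeCatalogProperties(String quotaProject, String warehouse) {
        Map<String, String> props = new LinkedHashMap<>();
        props.put("header.x-goog-user-project", quotaProject);
        props.put("warehouse", warehouse);
        return props;
    }

    // Merge both groups into the single property map a REST catalog client would receive.
    static Map<String, String> combined(Map<String, String> auth, Map<String, String> catalog) {
        Map<String, String> all = new LinkedHashMap<>(auth);
        all.putAll(catalog);
        return all;
    }

    public static void main(String[] args) {
        // All values below are placeholders chosen for illustration.
        Map<String, String> auth = googleAuthProperties(
                "/path/to/service-account.json",
                "https://www.googleapis.com/auth/cloud-platform");
        Map<String, String> catalog = bigLakeCatalogProperties("my-project", "gs://my-bucket");
        combined(auth, catalog).forEach((k, v) -> System.out.println(k + "=" + v));
    }
}
```

Keeping the two groups in separate builders means a catalog that uses Google Auth without BigLake would only need the first map, which is the case Rulin raises in point 3.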
