Hi, Rulin. I've done as you've suggested with the exception of adding GCP_CREDENTIALS_PATH_PROPERTY and GCP_SCOPES_PROPERTY to GcpAuthenticationParametersDpo as they would not be used in this ticket and I would recommend adding them in another ticket that uses them.
Having said that, I don't feel strongly about this and am happy to add if you really want them in this ticket. Otherwise, could you please take another look at the PR? Regards, Phillip On Wed, Mar 4, 2026 at 6:58 PM Rulin Xing <[email protected]> wrote: > Hey Phillip, > > Personally, I’m leaning toward following the Iceberg catalog properties as > they are today. > > In the Iceberg SDK, there isn't a first-class property for project id, it's > currently passed through the `header.x-goog-user-project`, even biglake's > public doc follows this pattern. > > I'd also prefer not to mix BigLake-specific properties with authentication > parameters since authentication is more generic, e.g., even for AWS Glue, > we don't introduce a special class for it but implement a generic > SigV4AuthenticationParameters. As I mentioned earlier, the project id is a > property required by the BigLake IRC, but it's not an authentication > parameter. > > Given that, my proposal would be: > > - Add an additional headers field in IcebergRestConnectionConfigInfo. > - I think it’s okay not to enforce checking the existence of this > property when the target catalog is GCP. > IcebergRestConnectionConfigInfo is > designed to connect to any Iceberg REST–compliant catalog, so if > a property > is only required by a specific catalog implementation, it shouldn’t > be > validated generically at this layer. > - Introduce GcpAuthenticationParametersDpo, but limit it to two fields > for now that follows *GoogleAuthManager*: > - GCP_CREDENTIALS_PATH_PROPERTY: if this property is not provided, > it means that we will rely on gcp auth sdk to detect the gcp > credentials > from env > - GCP_SCOPES_PROPERTY > > Later, we can extend this to support impersonation flow and subscope logic > by adding the necessary service identity-related functionality. > > Regards, > Rulin > > On Wed, Mar 4, 2026 at 3:43 AM Phillip Henry <[email protected]> > wrote: > > > Thanks, Rulin. > > > > Regarding points 1-3: would a step in the right direction be to call the > > DTO *BigLake*AuthenticationParametersDpo rather than *Gcp* > > AuthenticationParametersDpo? I'm not trying to solve all the possible GCP > > possibilities as I think that is beyond the scope of this ticket. > Rather, I > > just wanted to connect to my BigLake metastore - something the code > > currently enables. > > > > If this were acceptable to you, I'd then argue that quotaProject is > > analogous to, say, SigV4AuthenticationParameters.signingRegion insofar as > > both basically define where the request is being sent (albeit one > > physically and one logically). If people disagree, I'm happy to put > > quotaProject elsewhere but I will need a steer on exactly where it should > > go. > > > > What I'm trying to avoid, however, is the scope of this ticket > ballooning. > > > > Please let me know your thoughts. > > > > Regards, > > > > Phillip > > > > > > > > On Tue, Mar 3, 2026 at 6:47 PM Rulin Xing <[email protected]> wrote: > > > > > Hi Phillip, > > > > > > Thanks for starting this thread! I want to clarify a few points > regarding > > > to my comment: > > > > > > 1. GoogleAuthManager.java#L76-L77 > > > < > > > > > > https://github.com/apache/iceberg/blob/1.10.x/gcp/src/main/java/org/apache/iceberg/gcp/auth/GoogleAuthManager.java#L76-L77 > > > >: > > > GoogleAuthManager only relies on these two properties for now: > > > gcp.auth.credentials-path and gcp.auth.scopes. google project id is > > not > > > needed by GoogleAuthManager > > > 2. We should clearly separate authentication parameters from other > > > connection configurations. IcebergRestConnectionConfigInfo is > designed > > > to connect to any remote catalog that complies with the Iceberg REST > > > spec, > > > while GcpAuthenticationParametersDpo should include only Google > > > authentication–related properties. It should be generic enough to > work > > > with > > > any catalog that uses Google Auth. > > > 3. Google Auth doesn't need the project id, it's GCP big lake that > > > requires this property: > > > https://docs.cloud.google.com/biglake/docs/blms-rest-catalog > > > - If we look at Polaris GCP storage config, we also don’t need to > > > provide a project ID, it only requires the gcs service account: > > > (which is > > > not a good design since this property should be provided by > > > catalog service > > > and shouldn't be provided by end users): > > > > > > > > > https://github.com/apache/polaris/blob/release/1.3.x/spec/polaris-management-service.yml#L1162-L1171 > > > - If customers host their catalog server on GCP and use Google > Auth > > > for authentication, they do not need to specify a project id > > > when creating > > > the Polaris external catalog entity. > > > - For example, if they use GCP Proxy Service (API Gateway) to > > > expose REST APIs and use google auth, no project id is > required. > > > - GCP BigLake requires both header.x-goog-user-project and > > warehouse > > > (catalog name). These are used to identify the catalog on the > > server > > > side > > > because BigLake supports multiplexing (it’s a multi-tenant > > > catalog service > > > where each tenant can have multiple catalogs). Neither of these > > > properties > > > is related to authentication. > > > - From a design perspective, this is not ideal. For > comparison, > > in > > > Glue, a single property is used to achieve the same purpose, > > > e.g.,: > > > - warehouse=<aws_account_id>:s3tables/<catalog_name> > > > - Here, aws_account_id is equivalent to the Google project > > ID, > > > - and s3tables/<catalog_name> is equivalent to BigLake’s > > > warehouse. > > > 4. Another part that is not mentioned is the service identity > > > part, we can discuss this later > > > > > > Thanks. > > > Rulin > > > > > > On Mon, Mar 2, 2026 at 6:40 AM Phillip Henry <[email protected]> > > > wrote: > > > > > > > Regarding the points Rulin Xing raises on ticket 3729 > > > > <https://github.com/apache/polaris/pull/3729#discussion_r2866294655 > >, > > I > > > > wanted to get some feedback from the community on the following. > > > > > > > > > > > > 1. I'd argue that project id property should not live on > > > > IcebergRestConnectionConfigInfo as it's GCP specific AFAIK > > > > 2. If it were to live in IcebergRestConnectionConfigInfo as a map > of > > > > properties in (I think that's what you're suggesting RX - correct > me > > > if > > > > I'm > > > > wrong) then its presence could not be enforced for GCP calls. > > > > 3. GcpAuthenticationParametersDpo "doesn't contain any > > > > authentication-related info" but it does contain only what is > > > necessary > > > > to > > > > trigger AuthManagers to use GCP and also to provide the means to > do > > > that > > > > via its asIcebergCatalogProperties and > > asAuthenticationParametersModel > > > > methods. I'd argue that this way of doing it leverages the > machinery > > > > that's > > > > already in place and minimally impacts the rest of the codebase. > > > > > > > > > > > > So, I'd propose removing the warehouse field as RX has helpfully > > pointed > > > > out but I think GcpAuthenticationParametersDpo should stay even if > its > > > DTO > > > > counterpart is revised. > > > > > > > > Thoughts? > > > > > > > > > >
