Hey Phillip,
Personally, I’m leaning toward following the Iceberg catalog properties as
they are today.
In the Iceberg SDK, there isn't a first-class property for project id, it's
currently passed through the `header.x-goog-user-project`, even biglake's
public doc follows this pattern.
I'd also prefer not to mix BigLake-specific properties with authentication
parameters since authentication is more generic, e.g., even for AWS Glue,
we don't introduce a special class for it but implement a generic
SigV4AuthenticationParameters. As I mentioned earlier, the project id is a
property required by the BigLake IRC, but it's not an authentication
parameter.
Given that, my proposal would be:
- Add an additional headers field in IcebergRestConnectionConfigInfo.
- I think it’s okay not to enforce checking the existence of this
property when the target catalog is GCP.
IcebergRestConnectionConfigInfo is
designed to connect to any Iceberg REST–compliant catalog, so if
a property
is only required by a specific catalog implementation, it shouldn’t be
validated generically at this layer.
- Introduce GcpAuthenticationParametersDpo, but limit it to two fields
for now that follows *GoogleAuthManager*:
- GCP_CREDENTIALS_PATH_PROPERTY: if this property is not provided,
it means that we will rely on gcp auth sdk to detect the gcp credentials
from env
- GCP_SCOPES_PROPERTY
Later, we can extend this to support impersonation flow and subscope logic
by adding the necessary service identity-related functionality.
Regards,
Rulin
On Wed, Mar 4, 2026 at 3:43 AM Phillip Henry <[email protected]>
wrote:
> Thanks, Rulin.
>
> Regarding points 1-3: would a step in the right direction be to call the
> DTO *BigLake*AuthenticationParametersDpo rather than *Gcp*
> AuthenticationParametersDpo? I'm not trying to solve all the possible GCP
> possibilities as I think that is beyond the scope of this ticket. Rather, I
> just wanted to connect to my BigLake metastore - something the code
> currently enables.
>
> If this were acceptable to you, I'd then argue that quotaProject is
> analogous to, say, SigV4AuthenticationParameters.signingRegion insofar as
> both basically define where the request is being sent (albeit one
> physically and one logically). If people disagree, I'm happy to put
> quotaProject elsewhere but I will need a steer on exactly where it should
> go.
>
> What I'm trying to avoid, however, is the scope of this ticket ballooning.
>
> Please let me know your thoughts.
>
> Regards,
>
> Phillip
>
>
>
> On Tue, Mar 3, 2026 at 6:47 PM Rulin Xing <[email protected]> wrote:
>
> > Hi Phillip,
> >
> > Thanks for starting this thread! I want to clarify a few points regarding
> > to my comment:
> >
> > 1. GoogleAuthManager.java#L76-L77
> > <
> >
> https://github.com/apache/iceberg/blob/1.10.x/gcp/src/main/java/org/apache/iceberg/gcp/auth/GoogleAuthManager.java#L76-L77
> > >:
> > GoogleAuthManager only relies on these two properties for now:
> > gcp.auth.credentials-path and gcp.auth.scopes. google project id is
> not
> > needed by GoogleAuthManager
> > 2. We should clearly separate authentication parameters from other
> > connection configurations. IcebergRestConnectionConfigInfo is designed
> > to connect to any remote catalog that complies with the Iceberg REST
> > spec,
> > while GcpAuthenticationParametersDpo should include only Google
> > authentication–related properties. It should be generic enough to work
> > with
> > any catalog that uses Google Auth.
> > 3. Google Auth doesn't need the project id, it's GCP big lake that
> > requires this property:
> > https://docs.cloud.google.com/biglake/docs/blms-rest-catalog
> > - If we look at Polaris GCP storage config, we also don’t need to
> > provide a project ID, it only requires the gcs service account:
> > (which is
> > not a good design since this property should be provided by
> > catalog service
> > and shouldn't be provided by end users):
> >
> >
> https://github.com/apache/polaris/blob/release/1.3.x/spec/polaris-management-service.yml#L1162-L1171
> > - If customers host their catalog server on GCP and use Google Auth
> > for authentication, they do not need to specify a project id
> > when creating
> > the Polaris external catalog entity.
> > - For example, if they use GCP Proxy Service (API Gateway) to
> > expose REST APIs and use google auth, no project id is required.
> > - GCP BigLake requires both header.x-goog-user-project and
> warehouse
> > (catalog name). These are used to identify the catalog on the
> server
> > side
> > because BigLake supports multiplexing (it’s a multi-tenant
> > catalog service
> > where each tenant can have multiple catalogs). Neither of these
> > properties
> > is related to authentication.
> > - From a design perspective, this is not ideal. For comparison,
> in
> > Glue, a single property is used to achieve the same purpose,
> > e.g.,:
> > - warehouse=<aws_account_id>:s3tables/<catalog_name>
> > - Here, aws_account_id is equivalent to the Google project
> ID,
> > - and s3tables/<catalog_name> is equivalent to BigLake’s
> > warehouse.
> > 4. Another part that is not mentioned is the service identity
> > part, we can discuss this later
> >
> > Thanks.
> > Rulin
> >
> > On Mon, Mar 2, 2026 at 6:40 AM Phillip Henry <[email protected]>
> > wrote:
> >
> > > Regarding the points Rulin Xing raises on ticket 3729
> > > <https://github.com/apache/polaris/pull/3729#discussion_r2866294655>,
> I
> > > wanted to get some feedback from the community on the following.
> > >
> > >
> > > 1. I'd argue that project id property should not live on
> > > IcebergRestConnectionConfigInfo as it's GCP specific AFAIK
> > > 2. If it were to live in IcebergRestConnectionConfigInfo as a map of
> > > properties in (I think that's what you're suggesting RX - correct me
> > if
> > > I'm
> > > wrong) then its presence could not be enforced for GCP calls.
> > > 3. GcpAuthenticationParametersDpo "doesn't contain any
> > > authentication-related info" but it does contain only what is
> > necessary
> > > to
> > > trigger AuthManagers to use GCP and also to provide the means to do
> > that
> > > via its asIcebergCatalogProperties and
> asAuthenticationParametersModel
> > > methods. I'd argue that this way of doing it leverages the machinery
> > > that's
> > > already in place and minimally impacts the rest of the codebase.
> > >
> > >
> > > So, I'd propose removing the warehouse field as RX has helpfully
> pointed
> > > out but I think GcpAuthenticationParametersDpo should stay even if its
> > DTO
> > > counterpart is revised.
> > >
> > > Thoughts?
> > >
> >
>