Hey Phillip,

Personally, I’m leaning toward following the Iceberg catalog properties as
they are today.

The Iceberg SDK has no first-class property for the project id. It is
currently passed through the `header.x-goog-user-project` catalog property,
and even BigLake's public doc follows this pattern.
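
To make the pass-through concrete, here is a minimal Java sketch of the
convention: catalog properties whose key starts with `header.` are sent as
HTTP headers with the prefix stripped. The class and method names
(HeaderPropertySketch, extractHeaders) and the values used are hypothetical;
this mirrors the convention, not the SDK's actual code.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration; not part of Iceberg or Polaris.
class HeaderPropertySketch {
  static final String HEADER_PREFIX = "header.";

  // Keep only the properties whose key starts with "header." and strip the
  // prefix, so "header.x-goog-user-project" becomes "x-goog-user-project".
  static Map<String, String> extractHeaders(Map<String, String> catalogProperties) {
    Map<String, String> headers = new LinkedHashMap<>();
    for (Map.Entry<String, String> entry : catalogProperties.entrySet()) {
      if (entry.getKey().startsWith(HEADER_PREFIX)) {
        headers.put(entry.getKey().substring(HEADER_PREFIX.length()), entry.getValue());
      }
    }
    return headers;
  }
}
```

With properties `header.x-goog-user-project=my-project` and
`warehouse=my-catalog`, only `x-goog-user-project` ends up as a header;
`warehouse` stays an ordinary catalog property.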

I'd also prefer not to mix BigLake-specific properties with authentication
parameters, since authentication is more generic. For AWS Glue, for example,
we don't introduce a special class but implement a generic
SigV4AuthenticationParameters. As I mentioned earlier, the project id is a
property required by the BigLake IRC, but it's not an authentication
parameter.

Given that, my proposal would be:

   - Add an additional headers field in IcebergRestConnectionConfigInfo.
      - I think it’s okay not to enforce checking the existence of this
      property when the target catalog is GCP.
      IcebergRestConnectionConfigInfo is designed to connect to any
      Iceberg REST–compliant catalog, so if a property is only required
      by a specific catalog implementation, it shouldn’t be validated
      generically at this layer.
   - Introduce GcpAuthenticationParametersDpo, but limit it for now to the
   two fields that *GoogleAuthManager* uses:
      -   GCP_CREDENTIALS_PATH_PROPERTY: if this property is not provided,
      we rely on the GCP auth SDK to detect the credentials from the
      environment
      -   GCP_SCOPES_PROPERTY
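
As a rough illustration of the proposal, here is a minimal Java sketch. The
class names (GcpAuthParamsSketch, ConnectionConfigSketch), the method
toCatalogProperties, and the example values are hypothetical and are not the
actual Polaris classes. The two gcp.auth.* keys are the ones named earlier
in this thread, and the sketch assumes headers are forwarded as
`header.`-prefixed catalog properties.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for the proposed GcpAuthenticationParametersDpo.
final class GcpAuthParamsSketch {
  static final String GCP_CREDENTIALS_PATH_PROPERTY = "gcp.auth.credentials-path";
  static final String GCP_SCOPES_PROPERTY = "gcp.auth.scopes";

  private final String credentialsPath; // nullable: fall back to env detection
  private final String scopes;          // nullable: leave the default in place

  GcpAuthParamsSketch(String credentialsPath, String scopes) {
    this.credentialsPath = credentialsPath;
    this.scopes = scopes;
  }

  // Emit only the properties that were explicitly configured.
  Map<String, String> asIcebergCatalogProperties() {
    Map<String, String> props = new LinkedHashMap<>();
    if (credentialsPath != null) {
      props.put(GCP_CREDENTIALS_PATH_PROPERTY, credentialsPath);
    }
    if (scopes != null) {
      props.put(GCP_SCOPES_PROPERTY, scopes);
    }
    return props;
  }
}

// Hypothetical stand-in for a connection config with a generic headers field.
final class ConnectionConfigSketch {
  // Headers are passed through without catalog-specific validation:
  // this layer does not know which headers a given catalog requires.
  static Map<String, String> toCatalogProperties(
      String uri, Map<String, String> headers, GcpAuthParamsSketch auth) {
    Map<String, String> props = new LinkedHashMap<>();
    props.put("uri", uri);
    headers.forEach((name, value) -> props.put("header." + name, value));
    props.putAll(auth.asIcebergCatalogProperties());
    return props;
  }
}
```

The point of the split is that the project id travels in the generic headers
map, while the authentication object carries only what Google Auth itself
needs.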

Later, we can extend this to support impersonation flow and subscope logic
by adding the necessary service identity-related functionality.

Regards,
Rulin

On Wed, Mar 4, 2026 at 3:43 AM Phillip Henry <[email protected]>
wrote:

> Thanks, Rulin.
>
> Regarding points 1-3: would a step in the right direction be to call the
> DTO *BigLake*AuthenticationParametersDpo rather than *Gcp*
> AuthenticationParametersDpo? I'm not trying to solve all the possible GCP
> possibilities as I think that is beyond the scope of this ticket. Rather, I
> just wanted to connect to my BigLake metastore - something the code
> currently enables.
>
> If this were acceptable to you, I'd then argue that quotaProject is
> analogous to, say, SigV4AuthenticationParameters.signingRegion insofar as
> both basically define where the request is being sent (albeit one
> physically and one logically). If people disagree, I'm happy to put
> quotaProject elsewhere but I will need a steer on exactly where it should
> go.
>
> What I'm trying to avoid, however, is the scope of this ticket ballooning.
>
> Please let me know your thoughts.
>
> Regards,
>
> Phillip
>
>
>
> On Tue, Mar 3, 2026 at 6:47 PM Rulin Xing <[email protected]> wrote:
>
> > Hi Phillip,
> >
> > Thanks for starting this thread! I want to clarify a few points regarding
> > my comment:
> >
> >    1. GoogleAuthManager.java#L76-L77
> >    <https://github.com/apache/iceberg/blob/1.10.x/gcp/src/main/java/org/apache/iceberg/gcp/auth/GoogleAuthManager.java#L76-L77>:
> >    GoogleAuthManager only relies on these two properties for now:
> >    gcp.auth.credentials-path and gcp.auth.scopes. google project id is
> >    not needed by GoogleAuthManager
> >    2. We should clearly separate authentication parameters from other
> >    connection configurations. IcebergRestConnectionConfigInfo is designed
> >    to connect to any remote catalog that complies with the Iceberg REST
> >    spec, while GcpAuthenticationParametersDpo should include only Google
> >    authentication–related properties. It should be generic enough to work
> >    with any catalog that uses Google Auth.
> >    3. Google Auth doesn't need the project id; it's GCP BigLake that
> >    requires this property:
> >    https://docs.cloud.google.com/biglake/docs/blms-rest-catalog
> >       - If we look at the Polaris GCP storage config, we also don’t need
> >       to provide a project ID; it only requires the GCS service account
> >       (which is not a good design, since this property should be provided
> >       by the catalog service and shouldn't be provided by end users):
> >       https://github.com/apache/polaris/blob/release/1.3.x/spec/polaris-management-service.yml#L1162-L1171
> >       - If customers host their catalog server on GCP and use Google Auth
> >       for authentication, they do not need to specify a project id when
> >       creating the Polaris external catalog entity.
> >          - For example, if they use GCP Proxy Service (API Gateway) to
> >          expose REST APIs and use Google Auth, no project id is required.
> >       - GCP BigLake requires both header.x-goog-user-project and warehouse
> >       (catalog name). These are used to identify the catalog on the
> >       server side because BigLake supports multiplexing (it’s a
> >       multi-tenant catalog service where each tenant can have multiple
> >       catalogs). Neither of these properties is related to authentication.
> >          - From a design perspective, this is not ideal. For comparison,
> >          in Glue, a single property is used to achieve the same purpose,
> >          e.g.:
> >             - warehouse=<aws_account_id>:s3tables/<catalog_name>
> >             - Here, aws_account_id is equivalent to the Google project ID,
> >             and s3tables/<catalog_name> is equivalent to BigLake’s
> >             warehouse.
> >    4. Another part that has not been mentioned is the service identity
> >    part; we can discuss this later.
> >
> > Thanks.
> > Rulin
> >
> > On Mon, Mar 2, 2026 at 6:40 AM Phillip Henry <[email protected]>
> > wrote:
> >
> > > Regarding the points Rulin Xing raises on ticket 3729
> > > <https://github.com/apache/polaris/pull/3729#discussion_r2866294655>,
> > > I wanted to get some feedback from the community on the following.
> > >
> > >
> > >    1. I'd argue that the project id property should not live on
> > >    IcebergRestConnectionConfigInfo as it's GCP specific AFAIK
> > >    2. If it were to live in IcebergRestConnectionConfigInfo as a map of
> > >    properties (I think that's what you're suggesting RX - correct me
> > >    if I'm wrong) then its presence could not be enforced for GCP calls.
> > >    3. GcpAuthenticationParametersDpo "doesn't contain any
> > >    authentication-related info" but it does contain only what is
> > >    necessary to trigger AuthManagers to use GCP and also to provide the
> > >    means to do that via its asIcebergCatalogProperties and
> > >    asAuthenticationParametersModel methods. I'd argue that this way of
> > >    doing it leverages the machinery that's already in place and
> > >    minimally impacts the rest of the codebase.
> > >
> > >
> > > So, I'd propose removing the warehouse field as RX has helpfully
> > > pointed out but I think GcpAuthenticationParametersDpo should stay even
> > > if its DTO counterpart is revised.
> > >
> > > Thoughts?
> > >
> >
>
