Hi Phillip,

Thanks for starting this thread! I want to clarify a few points regarding
my comment:

   1. GoogleAuthManager.java#L76-L77
   <https://github.com/apache/iceberg/blob/1.10.x/gcp/src/main/java/org/apache/iceberg/gcp/auth/GoogleAuthManager.java#L76-L77>:
   GoogleAuthManager only relies on these two properties for now:
   gcp.auth.credentials-path and gcp.auth.scopes. The Google project ID is
   not needed by GoogleAuthManager.
   2. We should clearly separate authentication parameters from other
   connection configurations. IcebergRestConnectionConfigInfo is designed
   to connect to any remote catalog that complies with the Iceberg REST spec,
   while GcpAuthenticationParametersDpo should include only Google
   authentication-related properties. It should be generic enough to work with
   any catalog that uses Google Auth.
   3. Google Auth doesn't need the project ID; it's GCP BigLake that
   requires this property:
   https://docs.cloud.google.com/biglake/docs/blms-rest-catalog
      - If we look at the Polaris GCP storage config, we also don’t need to
      provide a project ID. It only requires the GCS service account (which
      is not a good design, since this property should be provided by the
      catalog service rather than by end users):
      https://github.com/apache/polaris/blob/release/1.3.x/spec/polaris-management-service.yml#L1162-L1171
      - If customers host their catalog server on GCP and use Google Auth
      for authentication, they do not need to specify a project ID when
      creating the Polaris external catalog entity.
         - For example, if they use GCP Proxy Service (API Gateway) to
         expose REST APIs and use Google Auth, no project ID is required.
      - GCP BigLake requires both header.x-goog-user-project and warehouse
      (catalog name). These are used to identify the catalog on the server side
      because BigLake supports multiplexing (it’s a multi-tenant catalog
      service where each tenant can have multiple catalogs). Neither of these
      properties is related to authentication.
         - From a design perspective, this is not ideal. For comparison, in
         Glue, a single property is used to achieve the same purpose, e.g.:
            - warehouse=<aws_account_id>:s3tables/<catalog_name>
            - Here, aws_account_id is equivalent to the Google project ID,
            and s3tables/<catalog_name> is equivalent to BigLake’s
            warehouse.
   4. Another part that has not been mentioned yet is the service identity;
   we can discuss this later.
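
To make the separation in points 2 and 3 concrete, here is a rough sketch of
how the two groups of properties could be kept apart and only merged when the
REST client is configured. The class and method names are hypothetical (they
are not Polaris or Iceberg APIs), and the values in main are placeholders; only
the four property keys come from the discussion above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative sketch only: class and method names are hypothetical,
// not part of Polaris or Iceberg.
public class GoogleAuthPropertiesSketch {

  // Authentication-only properties: the two keys GoogleAuthManager relies on.
  static Map<String, String> authProperties(String credentialsPath, String scopes) {
    Map<String, String> props = new LinkedHashMap<>();
    props.put("gcp.auth.credentials-path", credentialsPath);
    props.put("gcp.auth.scopes", scopes);
    return props;
  }

  // BigLake-specific properties that identify the catalog on the server side.
  // Neither of them is related to authentication.
  static Map<String, String> bigLakeConnectionProperties(String projectId, String warehouse) {
    Map<String, String> props = new LinkedHashMap<>();
    props.put("header.x-goog-user-project", projectId);
    props.put("warehouse", warehouse);
    return props;
  }

  // Combine both groups into one map of catalog properties.
  static Map<String, String> merge(Map<String, String> auth, Map<String, String> connection) {
    Map<String, String> all = new LinkedHashMap<>(auth);
    all.putAll(connection);
    return all;
  }

  public static void main(String[] args) {
    // Placeholder values, for illustration only.
    Map<String, String> auth =
        authProperties("/path/to/key.json", "https://www.googleapis.com/auth/cloud-platform");
    Map<String, String> bigLake = bigLakeConnectionProperties("my-project", "my-catalog");
    merge(auth, bigLake).forEach((k, v) -> System.out.println(k + "=" + v));
  }
}
```

A catalog that sits behind Google Auth but is not BigLake would pass only the
first map, which is why I think the project ID should not be part of the
authentication parameters.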

Thanks.
Rulin

On Mon, Mar 2, 2026 at 6:40 AM Phillip Henry <[email protected]>
wrote:

> Regarding the points Rulin Xing raises on ticket 3729
> <https://github.com/apache/polaris/pull/3729#discussion_r2866294655>, I
> wanted to get some feedback from the community on the following.
>
>
>    1. I'd argue that project id property should not live on
>    IcebergRestConnectionConfigInfo as it's GCP specific AFAIK
>    2. If it were to live in IcebergRestConnectionConfigInfo as a map of
>    properties in (I think that's what you're suggesting RX - correct me if
>    I'm wrong) then its presence could not be enforced for GCP calls.
>    3. GcpAuthenticationParametersDpo "doesn't contain any
>    authentication-related info" but it does contain only what is necessary
>    to trigger AuthManagers to use GCP and also to provide the means to do
>    that via its asIcebergCatalogProperties and
>    asAuthenticationParametersModel methods. I'd argue that this way of doing
>    it leverages the machinery that's already in place and minimally impacts
>    the rest of the codebase.
>
>
> So, I'd propose removing the warehouse field as RX has helpfully pointed
> out but I think GcpAuthenticationParametersDpo should stay even if its DTO
> counterpart is revised.
>
> Thoughts?
>
