SKIP_CREDENTIAL_SUBSCOPING_INDIRECTION

I do not think that this option is a solution for self-managed deployments
at all.

It effectively disables credential vending, which is still a valuable
feature for self-managed cases.
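
To illustrate the concern, here is a rough Python sketch of what a "skip subscoping" switch amounts to. All names are hypothetical and this is not the actual Java in FileIOUtil; `mint` stands in for whatever produces subscoped credentials.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class VendedCredentials:
    """Short-lived credentials narrowed to one table location (illustrative)."""
    access_key_id: str
    secret_access_key: str
    session_token: str
    allowed_prefix: str


def vend_credentials(
    table_location: str,
    skip_subscoping: bool,
    mint: Callable[[str], VendedCredentials],
) -> Optional[VendedCredentials]:
    """Return credentials for the client, or None when nothing is vended."""
    if skip_subscoping:
        # Nothing is vended; the client must rely on its own environment.
        return None
    return mint(table_location)


def fake_mint(location: str) -> VendedCredentials:
    # Placeholder values only; a real deployment would call the cloud provider.
    return VendedCredentials("example-key-id", "example-secret", "example-token", location)
```

With the switch on, every client gets nothing back and has to bring its own credentials, even where vending would still be useful.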

So basically we'd have 2 modes of running Polaris [...]


I'd really like to avoid having "running modes" in the sense of having this
"mode" as a code-level config or flag.

I believe configuration options should give the admin user enough control
over how Polaris behaves, but each option should apply to a specific aspect
of Polaris behaviour rather than define an overarching "mode".

For example, subscoping for vended credentials is valuable, IMHO, even in
single-tenant deployments with a plain key/secret pair for authenticating
STS connections.
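
A minimal Python sketch of that point, with entirely hypothetical names and a hypothetical mapping from modes to settings (none of this is Polaris code): two independent options can express a combination that a two-value mode flag cannot.

```python
from dataclasses import dataclass
from enum import Enum


class StsAuth(Enum):
    """How the server authenticates its own STS calls (illustrative)."""
    ENVIRONMENT = "environment"        # default SDK credential chain
    STATIC_KEY_SECRET = "static"       # plain key/secret pair


@dataclass(frozen=True)
class CredentialSettings:
    """Two independent knobs instead of one overarching mode."""
    subscope_vended_credentials: bool = True
    sts_auth: StsAuth = StsAuth.ENVIRONMENT


MODES = ("multi-tenant", "single-tenant")


def settings_for_mode(mode: str) -> CredentialSettings:
    """What a single mode flag might imply (assumed mapping, for illustration)."""
    if mode == "multi-tenant":
        return CredentialSettings(True, StsAuth.ENVIRONMENT)
    if mode == "single-tenant":
        return CredentialSettings(False, StsAuth.ENVIRONMENT)
    raise ValueError(f"unknown mode: {mode}")


# The self-managed case described above: static key/secret for STS,
# with subscoping of vended credentials still enabled.
self_managed = CredentialSettings(
    subscope_vended_credentials=True,
    sts_auth=StsAuth.STATIC_KEY_SECRET,
)

reachable_via_modes = {settings_for_mode(m) for m in MODES}
```

Here `self_managed` is not in `reachable_via_modes`, which is the gap a mode-style flag would leave.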

Cheers,
Dmitri.

On Mon, May 5, 2025 at 2:36 PM Dennis Huo <huoi...@gmail.com> wrote:

> In general this sigv4 indirection control-flow should mirror the analogous
> patterns we apply on the StorageConfigInfo side (and perhaps long-term we
> can better consolidate the STS logic for the two), so I'd agree it's not
> even necessarily federation-specific.
>
> There's some precedent for the use-case of a "self-run Polaris" user
> wanting to just use simple server-wide configuration for StorageConfigInfo
> already: SKIP_CREDENTIAL_SUBSCOPING_INDIRECTION
>
>
> https://github.com/apache/polaris/blob/4db7998381a61e9cab82cdc4fded6867b0bca464/service/common/src/main/java/org/apache/polaris/service/catalog/io/FileIOUtil.java#L92
>
> For this Catalog Federation sigv4 case we could introduce a similar feature
> configuration; whether or not this feature configuration is the exact way
> we want to do it long-term, it would make sense to refactor both the
> ConnectionConfig and StorageConfig parts together in the future.
>
> One important concept for this simple approach is that instead of getting
> into the business of having Polaris actually try to juggle long-lived
> credentials for IAM Users explicitly, this "simple case" can just inherit
> "environment-provided" credentials and let low-level SDK libraries use
> their default "credential chain" logic.
>
> So basically we'd have 2 modes of running Polaris:
>
> 1. Secure multi-tenant - Polaris will have opinionated/constrained
> scaffolding via layers of credential indirection, subscoping,
> secrets-management, etc.
> 2. Single-tenant - Polaris will be more hands-off in terms of secrets
> management, instead allowing thick clients to use typical
> "environment-provided" credentials (e.g. environment variables, EC2
> instance-metadata endpoint, local credential files, etc)
>
> On Fri, May 2, 2025 at 4:28 PM Dmitri Bourlatchkov <di...@apache.org>
> wrote:
>
> > I think this discussion moves slightly out of the scope of catalog
> > federation and into handling secrets :) ... but the points you're making
> > are quite valid.
> >
> > Let's keep them in mind when we reopen the secrets handling discussion.
> >
> > Cheers,
> > Dmitri.
> >
> > On Fri, May 2, 2025 at 7:04 PM Rulin Xing <ru...@apache.org> wrote:
> >
> > > Hi Dmitri,
> > >
> > > Totally agree that we need to recognize the self-managed deployment case
> > > as a first-class scenario. That means we should provide a way to configure
> > > Polaris with long-lived credentials.
> > >
> > > I see a couple of options for supporting this:
> > > 1. From env vars or server config, e.g.:
> > >   * POLARIS_IAM_USER_AWS_ACCESS_KEY_ID
> > >   * POLARIS_IAM_USER_AWS_SECRET_ACCESS_KEY
> > >   * POLARIS_IAM_USER_ARN
> > > In this case, `roleArn` would not be required.
> > >
> > > 2. Configured via the Polaris Management API: Stick to
> > > `SigV4AuthenticationParameters`
> > >
> > > If we stick with the existing `SigV4AuthenticationParameters` type, we
> > > could:
> > > * Make roleArn optional
> > > * Add `iamUserAwsAccessKeyId` and `iamUserAwsSecretAccessKey` as optional
> > > fields
> > >
> > > 3. Configured via the Polaris Management API: Add new auth type
> > >
> > > We could create a new type to distinguish clearly:
> > > * New AuthenticationType enum: SIGV4_STS, SIGV4_STATIC_CREDS
> > >
> > > 4. Configured via the Polaris Management API: Add new auth subtypes
> > >
> > > We could create a new subtype to distinguish clearly:
> > > e.g. new subtype under SigV4AuthenticationParameters: STS, CREDS
> > >
> > > Personally, I would prefer option 4. WDYT?
> > >
> > > I'll include these options in my PR as well for discussion.
> > >
> > > Best,
> > > Rulin
> > >
> > >
> > > On 2025/05/02 17:16:44 Dmitri Bourlatchkov wrote:
> > > > Thanks for your message, Rulin! You made good points and I agree with
> > > > them.
> > > >
> > > > I'm planning to introduce a `PolarisConnectionCredentialVendor`
> > > >
> > > >
> > > > Looking forward to this proposal!
> > > >
> > > >
> > > > The goal is to draw a clear boundary between user-provided input and
> > > > Polaris-generated service info [...]
> > > >
> > > >
> > > > I support this goal, however, I'd like to emphasise that there may be some
> > > > skew in different deployment models.
> > > >
> > > > Traditionally Polaris was envisioned as a service running for multiple
> > > > users from distinct organisations, I guess. However, when Apache Polaris
> > > > releases binary artifacts users will be able to run their own deployments.
> > > > In that situation, the boundary between what is configured at the
> > > > deployment level and what is configured via the Polaris Management API may
> > > > not be as sharp.
> > > >
> > > > I believe we need to recognise the self-managed deployment case and
> > > > consider it as a mainstream case. I'm sure we're going to have some real
> > > > users behind this use case soon.
> > > >
> > > > Specifically for the SigV4 authentication option in Federated Catalogs, I
> > > > guess this means that users may want to use simpler key/secret pairs as
> > > > input for secure connections to AWS services like Glue. In self-managed
> > > > deployments this is not a security risk, from my POV.
> > > >
> > > > Would you consider it as a possible future enhancement?
> > > >
> > > > If yes, do you think it would fall under the proposed
> > > > SigV4AuthenticationParameters (as a set of new optional attributes,
> > > > perhaps), or maybe be a different config type altogether? (This is
> > > > related to my GH comment about type names, but the problem is bigger
> > > > than just naming, I think.)
> > > >
> > > > I do not question that the STS / assume role path offers better security
> > > > guarantees. My point is that it may still be valuable for OSS users to have
> > > > simpler connection options.
> > > >
> > > > Thanks,
> > > > Dmitri.
> > > >
> > > > On Thu, May 1, 2025 at 9:54 PM Rulin Xing <ru...@apache.org> wrote:
> > > >
> > > > > Hi Dmitri,
> > > > >
> > > > > Thanks for the thoughtful questions!
> > > > >
> > > > > 1. Does this assume the use of STS?
> > > > >
> > > > > Yes, the current spec changes assume the use of STS. Polaris acts as a
> > > > > service provider and assumes IAM roles provided by users to access AWS
> > > > > resources like Glue Catalogs. This model avoids long-lived credentials and
> > > > > enables secure, temporary access via STS-issued credentials.
> > > > >
> > > > > 2. Why is plain key/secret SigV4 not an option?
> > > > >
> > > > > We can support plain key/secret credentials for SigV4, particularly in
> > > > > self-managed deployments where users own both the Polaris deployment and
> > > > > AWS accounts. However, to reduce security risks, we don't want to store
> > > > > long-lived credentials directly in the catalog entity. A more secure
> > > > > approach is to reference them using `UserSecretReference` (added by
> > > > > @dennishuo) and retrieve them through `UserSecretsManager`.
> > > > >
> > > > > 3. Where is Polaris expected to get credentials for STS requests?
> > > > >
> > > > > Polaris obtains credentials for STS calls from its own runtime
> > > > > environment, such as server config, environment variables, or cloud-native
> > > > > options like instance profiles. These are used to call AssumeRole on the
> > > > > user-provided IAM role.
> > > > >
> > > > > To support both temporary and static credential workflows, I'm planning to
> > > > > introduce a `PolarisConnectionCredentialVendor` (or
> > > > > `PolarisCredentialManager`) interface. This interface will:
> > > > > * Provide Polaris-generated service info (what we call vendor info) such
> > > > > as `userArn`, `externalId`, `consentUrl`, or `gcsServiceAccount`, which
> > > > > will be injected into the catalog entity's connection config / storage
> > > > > config. This info is exposed to users when they load the catalog entity and
> > > > > is needed for setting up the appropriate permissions (e.g., allowing
> > > > > Polaris to assume roles).
> > > > > * Retrieve temporary credentials from cloud providers (e.g., AWS STS,
> > > > > Azure identity services) when needed to perform authenticated operations.
> > > > >
> > > > > The goal is to draw a clear boundary between user-provided input and
> > > > > Polaris-generated service info (something that's currently unclear in
> > > > > storage configs). In the long term, we're aiming to unify both connection
> > > > > and storage credential handling in this interface to simplify the overall
> > > > > architecture and improve security.
> > > > >
> > > > > Best,
> > > > > Rulin
> > > > >
> > > > > On 2025/05/01 22:02:32 Dmitri Bourlatchkov wrote:
> > > > > > Hi Rulin,
> > > > > >
> > > > > > Thanks for the informative description in the PR!
> > > > > >
> > > > > > It looks like the authentication method relies on STS. As such it is a
> > > > > > sub-case of SigV4, I believe, because SigV4 can be used with plain
> > > > > > key/secret credentials without assuming a role.
> > > > > >
> > > > > > If that is so, could you clarify that in the description?
> > > > > >
> > > > > > Is there any particular reason for not supporting plain key/secret
> > > > > > credentials?
> > > > > >
> > > > > > When STS is in use, where is Polaris expected to get credentials for STS
> > > > > > requests?
> > > > > >
> > > > > > Thanks,
> > > > > > Dmitri.
> > > > > >
> > > > > > On Thu, May 1, 2025 at 5:37 PM Rulin Xing <ru...@apache.org> wrote:
> > > > > >
> > > > > > > Hi folks,
> > > > > > >
> > > > > > > Just wanted to surface a new API spec update proposal related to Catalog
> > > > > > > Federation:
> > > > > > >
> > > > > > > https://github.com/apache/polaris/pull/1506
> > > > > > >
> > > > > > > This adds support for AWS SigV4 authentication, enabling Polaris to
> > > > > > > federate to external Iceberg REST catalogs hosted behind services like AWS
> > > > > > > Glue, S3Tables, or API Gateway.
> > > > > > >
> > > > > > > It builds on earlier federation work and introduces a set of properties to
> > > > > > > support role assumption and request signing via SigV4.
> > > > > > >
> > > > > > > Feedback on the spec or implementation is welcome!
> > > > > > >
> > > > > > > Best,
> > > > > > > Rulin
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>
