Hi Dmitri,

Totally agree that we need to recognize the self-managed deployment case as a 
first-class scenario. That means we should provide a way to configure Polaris 
with long-lived credentials.

I see a couple of options for supporting this:
1. From env vars or server config, e.g.:
  * POLARIS_IAM_USER_AWS_ACCESS_KEY_ID
  * POLARIS_IAM_USER_AWS_SECRET_ACCESS_KEY
  * POLARIS_IAM_USER_ARN
In this case, `roleArn` would not be required.
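
As a rough illustration of option 1, here is a minimal sketch of resolving the static credentials from the environment. The class and function names are hypothetical (not existing Polaris code), and it assumes the key ID and secret must be set together while the ARN is optional:

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class StaticAwsCredentials:
    """Hypothetical holder for deployment-level IAM user credentials."""
    access_key_id: str
    secret_access_key: str
    user_arn: Optional[str] = None


def load_static_credentials(
    env: Mapping[str, str] = os.environ,
) -> Optional[StaticAwsCredentials]:
    """Read the env vars proposed above; return None if none are configured."""
    key_id = env.get("POLARIS_IAM_USER_AWS_ACCESS_KEY_ID")
    secret = env.get("POLARIS_IAM_USER_AWS_SECRET_ACCESS_KEY")
    if not key_id and not secret:
        return None  # static credentials not configured at all
    if not key_id or not secret:
        # Assumption: a half-configured pair is treated as a config error.
        raise ValueError("access key ID and secret access key must be set together")
    return StaticAwsCredentials(key_id, secret, env.get("POLARIS_IAM_USER_ARN"))
```

Passing the environment as a parameter keeps the lookup easy to test without touching the real process environment.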

2. Configured via the Polaris Management API: Stick to `SigV4AuthenticationParameters`

If we stick with the existing `SigV4AuthenticationParameters` type, we could:
* Make `roleArn` optional
* Add `iamUserAwsAccessKeyId` and `iamUserAwsSecretAccessKey` as optional fields
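
One consequence of option 2 is that, with every field optional, the mode has to be inferred and validated at runtime. A minimal sketch of that check follows. The class is an illustrative stand-in, not the real spec type, and it assumes the two modes are mutually exclusive, which we have not decided yet:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SigV4Params:
    """Illustrative stand-in for SigV4AuthenticationParameters under option 2."""
    role_arn: Optional[str] = None
    iam_user_aws_access_key_id: Optional[str] = None
    iam_user_aws_secret_access_key: Optional[str] = None


def credential_mode(params: SigV4Params) -> str:
    """Infer which workflow the optional fields describe."""
    has_role = params.role_arn is not None
    has_key = params.iam_user_aws_access_key_id is not None
    has_secret = params.iam_user_aws_secret_access_key is not None
    if has_key != has_secret:
        raise ValueError("key ID and secret access key must be provided together")
    if has_role == has_key:
        # Assumption: exactly one of roleArn or the key pair must be present.
        raise ValueError("provide either roleArn or a key pair, but not both")
    return "STS" if has_role else "STATIC_CREDS"
```

The string return values are placeholders chosen for the sketch.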

3. Configured via the Polaris Management API: Add new auth type

We could add new authentication types to distinguish the two modes clearly:
* New `AuthenticationType` enum values: `SIGV4_STS`, `SIGV4_STATIC_CREDS`

4. Configured via the Polaris Management API: Add new SigV4 subtypes

We could create new subtypes to distinguish the two modes clearly,
e.g. new subtypes under `SigV4AuthenticationParameters`: STS, CREDS
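
To make option 4 a bit more concrete, here is a minimal sketch of how a request payload could be dispatched on a subtype discriminator. The discriminator name `sigv4Type` and the class names are hypothetical, and the payload field names are borrowed from option 2 purely for illustration:

```python
from dataclasses import dataclass
from typing import Any, Mapping, Union


@dataclass(frozen=True)
class StsSigV4Parameters:
    """STS subtype: Polaris assumes a user-provided role."""
    role_arn: str


@dataclass(frozen=True)
class CredsSigV4Parameters:
    """CREDS subtype: static key pair. In practice the secret would more
    likely be a reference resolved via a secrets manager (assumption)."""
    access_key_id: str
    secret_access_key: str


SigV4Parameters = Union[StsSigV4Parameters, CredsSigV4Parameters]


def parse_sigv4_parameters(payload: Mapping[str, Any]) -> SigV4Parameters:
    """Build the matching subtype from a dict-like request payload."""
    subtype = payload.get("sigv4Type")  # hypothetical discriminator field
    if subtype == "STS":
        return StsSigV4Parameters(role_arn=payload["roleArn"])
    if subtype == "CREDS":
        return CredsSigV4Parameters(
            access_key_id=payload["iamUserAwsAccessKeyId"],
            secret_access_key=payload["iamUserAwsSecretAccessKey"],
        )
    raise ValueError(f"unknown SigV4 subtype: {subtype!r}")
```

With explicit subtypes, each variant carries only required fields, so the "which fields go together" check from option 2 is replaced by the discriminator.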

Personally, I would prefer option 4. WDYT?

I'll include these options in my PR as well for discussion.

Best,
Rulin


On 2025/05/02 17:16:44 Dmitri Bourlatchkov wrote:
> Thanks for your message, Rulin! You made good points and I agree with them.
> 
> > I'm planning to introduce a `PolarisConnectionCredentialVendor`
> 
> Looking forward to this proposal!
> 
> > The goal is to draw a clear boundary between user-provided input and
> > Polaris-generated service info [...]
> 
> I support this goal, however, I'd like to emphasise that there may be some
> skew in different deployment models.
> 
> Traditionally Polaris was envisioned as a service running for multiple
> users from distinct organisations, I guess. However, when Apache Polaris
> releases binary artifacts users will be able to run their own deployments.
> In that situation, the boundary between what is configured at the
> deployment level and what is configured via the Polaris Management API may
> not be as sharp.
> 
> I believe we need to recognise the self-managed deployment case and
> consider it as a mainstream case. I'm sure we're going to have some real
> users behind this use case soon.
> 
> Specifically for the SigV4 authentication option in Federated Catalogs, I
> guess this means that users may want to use simpler key/secret pairs as
> input for secure connections to AWS services like Glue. In self-managed
> deployments this is not a security risk, from my POV.
> 
> Would you consider it as a possible future enhancement?
> 
> If yes, do you think it would fall under the proposed
> `SigV4AuthenticationParameters` (as a set of new optional attributes,
> perhaps), or should it be a different config type altogether? (This is
> related to my GH comment about type names, but the problem is bigger than
> just naming, I think.)
> 
> I do not question that the STS / assume role path offers better security
> guarantees. My point is that it may still be valuable for OSS users to have
> simpler connection options.
> 
> Thanks,
> Dmitri.
> 
> On Thu, May 1, 2025 at 9:54 PM Rulin Xing <ru...@apache.org> wrote:
> 
> > Hi Dmitri,
> >
> > Thanks for the thoughtful questions!
> >
> > 1. Does this assume the use of STS?
> >
> > Yes, the current spec changes assume the use of STS. Polaris acts as a
> > service provider and assumes IAM roles provided by users to access AWS
> > resources like Glue Catalogs. This model avoids long-lived credentials and
> > enables secure, temporary access via STS-issued credentials.
> >
> > 2. Why is plain key/secret SigV4 not an option?
> >
> > We can support plain key/secret credentials for SigV4, particularly in
> > self-managed deployments where users own both the Polaris deployment and
> > AWS accounts. However, to reduce security risks, we don't want to store
> > long-lived credentials directly in the catalog entity. A more secure
> > approach is to reference them using `UserSecretReference` (added by
> > @dennishuo) and retrieve them through `UserSecretsManager`.
> >
> > 3. Where is Polaris expected to get credentials for STS requests?
> >
> > Polaris obtains credentials for STS calls from its own runtime
> > environment, such as server config, environment variables, or cloud-native
> > options like instance profiles. These are used to call AssumeRole on the
> > user-provided IAM role.
> >
> > To support both temporary and static credential workflows, I'm planning to
> > introduce a `PolarisConnectionCredentialVendor` (or
> > `PolarisCredentialManager`) interface. This interface will:
> > * Provide Polaris-generated service info (what we call vendor info) such
> > as `userArn`, `externalId`, `consentUrl`, or `gcsServiceAccount`, which
> > will be injected into the catalog entity's connection config / storage
> > config. This info is exposed to users when they load the catalog entity and
> > is needed for setting up the appropriate permissions (e.g., allowing
> > Polaris to assume roles).
> > * Retrieve temporary credentials from cloud providers (e.g., AWS STS,
> > Azure identity services) when needed to perform authenticated operations.
> >
> > The goal is to draw a clear boundary between user-provided input and
> > Polaris-generated service info (something that's currently unclear in
> > storage configs). In the long term, we're aiming to unify both connection
> > and storage credential handling in this interface to simplify the overall
> > architecture and improve security.
> >
> > Best,
> > Rulin
> >
> > On 2025/05/01 22:02:32 Dmitri Bourlatchkov wrote:
> > > Hi Rulin,
> > >
> > > Thanks for the informative description in the PR!
> > >
> > > It looks like the authentication method relies on STS. As such it is a
> > > sub-case of SigV4, I believe, because SigV4 can be used with plain
> > > key/secret credentials without assuming a role.
> > >
> > > If that is so, could you clarify that in the description?
> > >
> > > Is there any particular reason for not supporting plain key/secret
> > > credentials?
> > >
> > > When STS is in use, where is Polaris expected to get credentials for STS
> > > requests?
> > >
> > > Thanks,
> > > Dmitri.
> > >
> > > On Thu, May 1, 2025 at 5:37 PM Rulin Xing <ru...@apache.org> wrote:
> > >
> > > > Hi folks,
> > > >
> > > > Just wanted to surface a new API spec update proposal related to
> > > > Catalog Federation:
> > > >
> > > > https://github.com/apache/polaris/pull/1506
> > > >
> > > > This adds support for AWS SigV4 authentication, enabling Polaris to
> > > > federate to external Iceberg REST catalogs hosted behind services like
> > > > AWS Glue, S3Tables, or API Gateway.
> > > >
> > > > It builds on earlier federation work and introduces a set of
> > > > properties to support role assumption and request signing via SigV4.
> > > >
> > > > Feedback on the spec or implementation is welcome!
> > > >
> > > > Best,
> > > > Rulin
> > > >
> > >
> >
> 
