Hi Dmitri,

Totally agree that we need to recognize the self-managed deployment case as a first-class scenario. That means we should provide a way to configure Polaris with long-lived credentials.

I see a couple of options for supporting this:

1. From env vars or server config, e.g.:
   * POLARIS_IAM_USER_AWS_ACCESS_KEY_ID
   * POLARIS_IAM_USER_AWS_SECRET_ACCESS_KEY
   * POLARIS_IAM_USER_ARN
   In this case, `roleArn` would not be required.

2. Configured via the Polaris Management API: stick to `SigV4AuthenticationParameters`
   If we stick with the existing `SigV4AuthenticationParameters` type, we could:
   * Make `roleArn` optional
   * Add `iamUserAwsAccessKeyId` and `iamUserAwsSecretAccessKey` as optional fields

3. Configured via the Polaris Management API: add a new auth type
   We could create a new type to distinguish clearly:
   * New AuthenticationType enum: SIGV4_STS, SIGV4_STATIC_CREDS

4. Configured via the Polaris Management API: add new auth subtypes
   We could create a new subtype to distinguish clearly,
   e.g. new subtypes under `SigV4AuthenticationParameters`: STS, CREDS

Personally, I would prefer option 4. WDYT? I'll include these options in my PR as well for discussion.

Best,
Rulin

On 2025/05/02 17:16:44 Dmitri Bourlatchkov wrote:
> Thanks for your message, Rulin! You made good points and I agree with them.
>
> I'm planning to introduce a `PolarisConnectionCredentialVendor`
>
>
> Looking forward to this proposal!
>
>
> The goal is to draw a clear boundary between user-provided input and
> Polaris-generated service info [...]
>
>
> I support this goal, however, I'd like to emphasise that there may be some
> skew in different deployment models.
>
> Traditionally Polaris was envisioned as a service running for multiple
> users from distinct organisations, I guess. However, when Apache Polaris
> releases binary artifacts users will be able to run their own deployments.
> In that situation, the boundary between what is configured at the
> deployment level and what is configured via the Polaris Management API may
> not be as sharp.
>
> I believe we need to recognise the self-managed deployment case and
> consider it as a mainstream case.
> I'm sure we're going to have some real
> users behind this use case soon.
>
> Specifically for the SigV4 authentication option in Federated Catalogs, I
> guess this means that users may want to use simpler key/secret pairs as
> input for secure connections to AWS services like Glue. In self-managed
> deployments this is not a security risk, from my POV.
>
> Would you consider it as a possible future enhancement?
>
> If yes, do you think it would fall under the proposed
> SigV4AuthenticationParameters
> (as a set of new optional attributes perhaps)?.. or maybe be a different
> config type altogether? (this is related to my GH comment about type names,
> but the problem is bigger than just naming, I think).
>
> I do not question that the STS / assume role path offers better security
> guarantees. My point is that it may still be valuable for OSS users to have
> simpler connection options.
>
> Thanks,
> Dmitri.
>
> On Thu, May 1, 2025 at 9:54 PM Rulin Xing <ru...@apache.org> wrote:
>
> > Hi Dmitri,
> >
> > Thanks for the thoughtful questions!
> >
> > 1. Does this assume the use of STS?
> >
> > Yes, the current spec changes assume the use of STS. Polaris acts as a
> > service provider and assumes IAM roles provided by users to access AWS
> > resources like Glue Catalogs. This model avoids long-lived credentials and
> > enables secure, temporary access via STS-issued credentials.
> >
> > 2. Why is plain key/secret SigV4 not an option?
> >
> > We can support plain key/secret credentials for SigV4, particularly in
> > self-managed deployments where users own both the Polaris deployment and
> > AWS accounts. However, to reduce security risks, we don't want to store
> > long-lived credentials directly in the catalog entity. A more secure
> > approach is to reference them using `UserSecretReference` (added by
> > @dennishuo) and retrieve them through `UserSecretsManager`.
> >
> > 3. Where is Polaris expected to get credentials for STS requests?
> >
> > Polaris obtains credentials for STS calls from its own runtime
> > environment, such as server config, environment variables, or cloud-native
> > options like instance profiles. These are used to call AssumeRole on the
> > user-provided IAM role.
> >
> > To support both temporary and static credential workflows, I'm planning to
> > introduce a `PolarisConnectionCredentialVendor` (or
> > `PolarisCredentialManager`) interface. This class will:
> > * Provide Polaris-generated service info (what we call vendor info) such
> >   as `userArn`, `externalId`, `consentUrl`, or `gcsServiceAccount`, which
> >   will be injected into the catalog entity's connection config / storage
> >   config. This info is exposed to users when they load the catalog entity and
> >   is needed for setting up the appropriate permissions (e.g., allowing
> >   Polaris to assume roles).
> > * Retrieve temporary credentials from cloud providers (e.g., AWS STS,
> >   Azure identity services) when needed to perform authenticated operations.
> >
> > The goal is to draw a clear boundary between user-provided input and
> > Polaris-generated service info (something that's currently unclear in
> > storage configs). In the long term, we're aiming to unify both connection
> > and storage credential handling in this interface to simplify the overall
> > architecture and improve security.
> >
> > Best,
> > Rulin
> >
> > On 2025/05/01 22:02:32 Dmitri Bourlatchkov wrote:
> > > Hi Rulin,
> > >
> > > Thanks for the informative description in the PR!
> > >
> > > It looks like the authentication method relies on STS. As such it is a
> > > sub-case of SigV4, I believe, because SigV4 can be used with plain
> > > key/secret credentials without assuming a role.
> > >
> > > If that is so, could you clarify that in the description?
> > >
> > > Is there any particular reason for not supporting plain key/secret
> > > credentials?
> > >
> > > When STS is in use, where is Polaris expected to get credentials for STS
> > > requests?
> > >
> > > Thanks,
> > > Dmitri.
> > >
> > > On Thu, May 1, 2025 at 5:37 PM Rulin Xing <ru...@apache.org> wrote:
> > >
> > > > Hi folks,
> > > >
> > > > Just wanted to surface a new API spec update proposal related to Catalog
> > > > Federation:
> > > >
> > > > https://github.com/apache/polaris/pull/1506
> > > >
> > > > This adds support for AWS SigV4 authentication, enabling Polaris to
> > > > federate to external Iceberg REST catalogs hosted behind services like AWS
> > > > Glue, S3Tables, or API Gateway.
> > > >
> > > > It builds on earlier federation work and introduces a set of properties to
> > > > support role assumption and request signing via SigV4.
> > > >
> > > > Feedback on the spec or implementation is welcome!
> > > >
> > > > Best,
> > > > Rulin
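
For discussion, here is a rough Java sketch of how option 1 (deployment-level env vars) and option 4 (STS / CREDS subtypes) from the top of this thread might look side by side. Everything in it is an assumption made for illustration: the type names `SigV4Credentials`, `StsCredentials`, `StaticCredentials` and `EnvCredentials` are hypothetical and are not taken from the PR or from the Polaris codebase. Only the two environment variable names and the STS / CREDS labels come from the thread. It does not call any AWS SDK.

```java
import java.util.Map;
import java.util.Objects;
import java.util.Optional;

/** Hypothetical shape for option 4: one SigV4 parameter type with two subtypes. */
interface SigV4Credentials {
    /** Subtype discriminator; "STS" and "CREDS" are the labels floated in the thread. */
    String subtype();
}

/** STS subtype: Polaris assumes a user-provided IAM role, so roleArn is mandatory. */
final class StsCredentials implements SigV4Credentials {
    final String roleArn;

    StsCredentials(String roleArn) {
        this.roleArn = Objects.requireNonNull(roleArn, "roleArn");
    }

    @Override
    public String subtype() {
        return "STS";
    }
}

/** CREDS subtype: long-lived IAM user keys, no roleArn involved. */
final class StaticCredentials implements SigV4Credentials {
    final String accessKeyId;
    // Assumption: kept inline for brevity; a real design would more likely
    // hold a reference to a stored secret than the secret itself.
    final String secretAccessKey;

    StaticCredentials(String accessKeyId, String secretAccessKey) {
        this.accessKeyId = Objects.requireNonNull(accessKeyId, "accessKeyId");
        this.secretAccessKey = Objects.requireNonNull(secretAccessKey, "secretAccessKey");
    }

    @Override
    public String subtype() {
        return "CREDS";
    }

    /** Never prints the secret, so the object is safe to log. */
    @Override
    public String toString() {
        return "StaticCredentials[accessKeyId=" + accessKeyId + ", secretAccessKey=***]";
    }
}

/** Option 1: deployment-level keys read from the env vars listed in the thread. */
final class EnvCredentials {
    static final String ACCESS_KEY_VAR = "POLARIS_IAM_USER_AWS_ACCESS_KEY_ID";
    static final String SECRET_KEY_VAR = "POLARIS_IAM_USER_AWS_SECRET_ACCESS_KEY";

    private EnvCredentials() {}

    /** Returns static credentials only when both variables are present and non-empty. */
    static Optional<SigV4Credentials> fromEnvironment(Map<String, String> env) {
        String id = env.get(ACCESS_KEY_VAR);
        String secret = env.get(SECRET_KEY_VAR);
        if (id == null || id.isEmpty() || secret == null || secret.isEmpty()) {
            return Optional.empty();
        }
        return Optional.of(new StaticCredentials(id, secret));
    }
}
```

Passing `System.getenv()` as the map would read the real process environment; taking a `Map` parameter just keeps the lookup easy to exercise in isolation. Keeping `roleArn` mandatory on the STS subtype and absent from the CREDS subtype is the main thing option 4 buys over making fields optional on a single type.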